Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[OffloadBundler] Rework the ctor of OffloadTargetInfo to support AMDGPU's generic target #122629

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

shiltian
Copy link
Contributor

@shiltian shiltian commented Jan 12, 2025

The current parsing logic for the target string assumes it follows the format <kind>-<triple>-<target id>:<feature>, such as hipv4-amdgcn-amd-amdhsa-gfx1030:+xnack.
Specifically, it assumes that <target id> does not contain any -, relying on rsplit for parsing.
However, this assumption breaks for AMDGPU's generic targets, which may contain one or more -, such as gfx10-3-generic or gfx12-generic.
As a result, the existing approach using rstrip is no longer reliable.

This patch reworks the parsing logic to handle target strings more robustly, including support for generic targets.
The bundler now strictly requires a 4-field target triple.
Additionally, a new Python helper function has been added to config.py to normalize the target triple into the 4-field format when it is not, ensuring tests pass reliably.

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' labels Jan 12, 2025
Copy link
Contributor Author

shiltian commented Jan 12, 2025

@llvmbot
Copy link
Member

llvmbot commented Jan 12, 2025

@llvm/pr-subscribers-testing-tools
@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-driver

Author: Shilei Tian (shiltian)

Changes

The current parsing of target string assumes to be in a form of
kind-triple-targetid:feature, such as
hipv4-amdgcn-amd-amdhsa-gfx1030:+xnack. Specifically, the target id does not
contain any -, which is not the case for generic target. Also, a generic
target may contain one or more -, such as gfx10-3-generic and
gfx12-generic. As a result, we can no longer depend on rstrip to get things
work right. This patch reworks the logic to parse the target string to make it
more robust, as well as supporting generic target.


Full diff: https://github.com/llvm/llvm-project/pull/122629.diff

6 Files Affected:

  • (modified) clang/docs/ClangOffloadBundler.rst (+3-4)
  • (modified) clang/lib/Driver/OffloadBundler.cpp (+14-26)
  • (modified) clang/test/Driver/clang-offload-bundler-asserts-on.c (+7-7)
  • (modified) clang/test/Driver/clang-offload-bundler-standardize.c (+6-7)
  • (modified) clang/test/Driver/clang-offload-bundler-zstd.c (+6-6)
  • (modified) clang/test/Driver/clang-offload-bundler.c (+10-10)
diff --git a/clang/docs/ClangOffloadBundler.rst b/clang/docs/ClangOffloadBundler.rst
index 3c241027d405ca..25214c2ea6a4e1 100644
--- a/clang/docs/ClangOffloadBundler.rst
+++ b/clang/docs/ClangOffloadBundler.rst
@@ -266,15 +266,14 @@ without differentiation based on offload kind.
     The target triple of the code object. See `Target Triple
     <https://clang.llvm.org/docs/CrossCompilation.html#target-triple>`_.
 
-    The bundler accepts target triples with or without the optional environment
-    field:
+    LLVM target triples can be with or without the optional environment field:
 
     ``<arch><sub>-<vendor>-<sys>``, or
     ``<arch><sub>-<vendor>-<sys>-<env>``
 
     However, in order to standardize outputs for tools that consume bitcode
-    bundles, bundles written by the bundler internally use only the 4-field
-    target triple:
+    bundles, the bundler only accepts target triples with the 4-field target
+    triple:
 
     ``<arch><sub>-<vendor>-<sys>-<env>``
 
diff --git a/clang/lib/Driver/OffloadBundler.cpp b/clang/lib/Driver/OffloadBundler.cpp
index 2d6bdff0393be5..474b5a0fa4148f 100644
--- a/clang/lib/Driver/OffloadBundler.cpp
+++ b/clang/lib/Driver/OffloadBundler.cpp
@@ -84,31 +84,19 @@ OffloadTargetInfo::OffloadTargetInfo(const StringRef Target,
     : BundlerConfig(BC) {
 
   // TODO: Add error checking from ClangOffloadBundler.cpp
-  auto TargetFeatures = Target.split(':');
-  auto TripleOrGPU = TargetFeatures.first.rsplit('-');
-
-  if (clang::StringToOffloadArch(TripleOrGPU.second) !=
-      clang::OffloadArch::UNKNOWN) {
-    auto KindTriple = TripleOrGPU.first.split('-');
-    this->OffloadKind = KindTriple.first;
-
-    // Enforce optional env field to standardize bundles
-    llvm::Triple t = llvm::Triple(KindTriple.second);
-    this->Triple = llvm::Triple(t.getArchName(), t.getVendorName(),
-                                t.getOSName(), t.getEnvironmentName());
-
-    this->TargetID = Target.substr(Target.find(TripleOrGPU.second));
-  } else {
-    auto KindTriple = TargetFeatures.first.split('-');
-    this->OffloadKind = KindTriple.first;
-
-    // Enforce optional env field to standardize bundles
-    llvm::Triple t = llvm::Triple(KindTriple.second);
-    this->Triple = llvm::Triple(t.getArchName(), t.getVendorName(),
-                                t.getOSName(), t.getEnvironmentName());
-
-    this->TargetID = "";
-  }
+  // <kind>-<triple>[-<target id>]
+  // <tiple> := <arch>-<vendor>-<os>-<env>
+  SmallVector<StringRef, 6> Components;
+  Target.split(':').first.split(Components, '-', /*MaxSplit=*/5);
+  assert((Components.size() == 5 || Components.size() == 6) &&
+         "malformed target string");
+
+  this->OffloadKind = Components.front();
+  this->TargetID = Components.size() == 6 ? Components.back() : "";
+  ArrayRef<StringRef> TripleSlice{&Components[1], /*length=*/4};
+  llvm::Triple T = llvm::Triple(llvm::join(TripleSlice, "-"));
+  this->Triple = llvm::Triple(T.getArchName(), T.getVendorName(), T.getOSName(),
+                              T.getEnvironmentName());
 }
 
 bool OffloadTargetInfo::hasHostKind() const {
@@ -148,7 +136,7 @@ bool OffloadTargetInfo::operator==(const OffloadTargetInfo &Target) const {
 }
 
 std::string OffloadTargetInfo::str() const {
-  return Twine(OffloadKind + "-" + Triple.str() + "-" + TargetID).str();
+  return Twine(OffloadKind + "-" + Triple.normalize() + "-" + TargetID).str();
 }
 
 static StringRef getDeviceFileExtension(StringRef Device,
diff --git a/clang/test/Driver/clang-offload-bundler-asserts-on.c b/clang/test/Driver/clang-offload-bundler-asserts-on.c
index 55060c2c42e734..eec30674107b25 100644
--- a/clang/test/Driver/clang-offload-bundler-asserts-on.c
+++ b/clang/test/Driver/clang-offload-bundler-asserts-on.c
@@ -15,20 +15,20 @@
 // Check code object compatibility for archive unbundling
 //
 // Create few code object bundles and archive them to create an input archive
-// RUN: clang-offload-bundler -type=o -targets=host-%itanium_abi_triple,openmp-amdgcn-amd-amdhsa-gfx906,openmp-amdgcn-amd-amdhsa--gfx908 -input=%t.o -input=%t.tgt1 -input=%t.tgt2 -output=%t.simple.bundle
+// RUN: clang-offload-bundler -type=o -targets=host-%itanium_abi_triple,openmp-amdgcn-amd-amdhsa--gfx906,openmp-amdgcn-amd-amdhsa--gfx908 -input=%t.o -input=%t.tgt1 -input=%t.tgt2 -output=%t.simple.bundle
 // RUN: clang-offload-bundler -type=o -targets=host-%itanium_abi_triple,openmp-amdgcn-amd-amdhsa--gfx906:sramecc+:xnack+,openmp-amdgcn-amd-amdhsa--gfx908:sramecc+:xnack+ -inputs=%t.o,%t.tgt1,%t.tgt1 -outputs=%t.targetID1.bundle
 // RUN: clang-offload-bundler -type=o -targets=host-%itanium_abi_triple,openmp-amdgcn-amd-amdhsa--gfx906:sramecc+:xnack-,openmp-amdgcn-amd-amdhsa--gfx908:sramecc+:xnack- -inputs=%t.o,%t.tgt1,%t.tgt1 -outputs=%t.targetID2.bundle
 // RUN: clang-offload-bundler -type=o -targets=host-%itanium_abi_triple,openmp-amdgcn-amd-amdhsa--gfx906:xnack-,openmp-amdgcn-amd-amdhsa--gfx908:xnack- -inputs=%t.o,%t.tgt1,%t.tgt1 -outputs=%t.targetID3.bundle
 // RUN: llvm-ar cr %t.input-archive.a %t.simple.bundle %t.targetID1.bundle %t.targetID2.bundle %t.targetID3.bundle
 
 // Tests to check compatibility between Bundle Entry ID formats i.e. between presence/absence of extra hyphen in case of missing environment field
-// RUN: clang-offload-bundler -unbundle -type=a -targets=openmp-amdgcn-amd-amdhsa--gfx906,openmp-amdgcn-amd-amdhsa-gfx908 -input=%t.input-archive.a -output=%t-archive-gfx906-simple.a -output=%t-archive-gfx908-simple.a -debug-only=CodeObjectCompatibility 2>&1 | FileCheck %s -check-prefix=BUNDLECOMPATIBILITY
-// BUNDLECOMPATIBILITY: Compatible: Exact match:        [CodeObject: openmp-amdgcn-amd-amdhsa--gfx906]  :       [Target: openmp-amdgcn-amd-amdhsa--gfx906]
-// BUNDLECOMPATIBILITY: Compatible: Exact match:        [CodeObject: openmp-amdgcn-amd-amdhsa--gfx908]  :       [Target: openmp-amdgcn-amd-amdhsa--gfx908]
+// RUN: clang-offload-bundler -unbundle -type=a -targets=openmp-amdgcn-amd-amdhsa--gfx906,openmp-amdgcn-amd-amdhsa--gfx908 -input=%t.input-archive.a -output=%t-archive-gfx906-simple.a -output=%t-archive-gfx908-simple.a -debug-only=CodeObjectCompatibility 2>&1 | FileCheck %s -check-prefix=BUNDLECOMPATIBILITY
+// BUNDLECOMPATIBILITY: Compatible: Exact match:        [CodeObject: openmp-amdgcn-amd-amdhsa-unknown-gfx906]  :       [Target: openmp-amdgcn-amd-amdhsa-unknown-gfx906]
+// BUNDLECOMPATIBILITY: Compatible: Exact match:        [CodeObject: openmp-amdgcn-amd-amdhsa-unknown-gfx908]  :       [Target: openmp-amdgcn-amd-amdhsa-unknown-gfx908]
 
-// RUN: clang-offload-bundler -unbundle -type=a -targets=hip-amdgcn-amd-amdhsa--gfx906,hipv4-amdgcn-amd-amdhsa-gfx908 -input=%t.input-archive.a -output=%t-hip-archive-gfx906-simple.a -output=%t-hipv4-archive-gfx908-simple.a -hip-openmp-compatible -debug-only=CodeObjectCompatibility 2>&1 | FileCheck %s -check-prefix=HIPOpenMPCOMPATIBILITY
-// HIPOpenMPCOMPATIBILITY: Compatible: Target IDs are compatible        [CodeObject: openmp-amdgcn-amd-amdhsa--gfx906]  :       [Target: hip-amdgcn-amd-amdhsa--gfx906]
-// HIPOpenMPCOMPATIBILITY: Compatible: Target IDs are compatible        [CodeObject: openmp-amdgcn-amd-amdhsa--gfx908]  :       [Target: hipv4-amdgcn-amd-amdhsa--gfx908]
+// RUN: clang-offload-bundler -unbundle -type=a -targets=hip-amdgcn-amd-amdhsa--gfx906,hipv4-amdgcn-amd-amdhsa--gfx908 -input=%t.input-archive.a -output=%t-hip-archive-gfx906-simple.a -output=%t-hipv4-archive-gfx908-simple.a -hip-openmp-compatible -debug-only=CodeObjectCompatibility 2>&1 | FileCheck %s -check-prefix=HIPOpenMPCOMPATIBILITY
+// HIPOpenMPCOMPATIBILITY: Compatible: Target IDs are compatible        [CodeObject: openmp-amdgcn-amd-amdhsa-unknown-gfx906]  :       [Target: hip-amdgcn-amd-amdhsa-unknown-gfx906]
+// HIPOpenMPCOMPATIBILITY: Compatible: Target IDs are compatible        [CodeObject: openmp-amdgcn-amd-amdhsa-unknown-gfx908]  :       [Target: hipv4-amdgcn-amd-amdhsa-unknown-gfx908]
 
 // Some code so that we can create a binary out of this file.
 int A = 0;
diff --git a/clang/test/Driver/clang-offload-bundler-standardize.c b/clang/test/Driver/clang-offload-bundler-standardize.c
index 52f5ea038e47b8..6550b8051987ab 100644
--- a/clang/test/Driver/clang-offload-bundler-standardize.c
+++ b/clang/test/Driver/clang-offload-bundler-standardize.c
@@ -16,19 +16,18 @@
 // Check code object compatibility for archive unbundling
 //
 // Create an object bundle with and without env fields
-// RUN: clang-offload-bundler -type=o -targets=host-%itanium_abi_triple,hip-amdgcn-amd-amdhsa-gfx906,hip-amdgcn-amd-amdhsa-gfx908 -input=%t.o -input=%t.tgt1 -input=%t.tgt2 -output=%t.bundle.no.env
-// RUN: clang-offload-bundler -type=o -targets=host-%itanium_abi_triple-,hip-amdgcn-amd-amdhsa--gfx906,hip-amdgcn-amd-amdhsa--gfx908 -input=%t.o -input=%t.tgt1 -input=%t.tgt2 -output=%t.bundle.env
+// RUN: clang-offload-bundler -type=o -targets=host-%itanium_abi_triple,hip-amdgcn-amd-amdhsa--gfx906,hip-amdgcn-amd-amdhsa--gfx908 -input=%t.o -input=%t.tgt1 -input=%t.tgt2 -output=%t.bundle.no.env
 
 
 // Unbundle bundle.no.env while providing targets with env
 // RUN: clang-offload-bundler -unbundle -type=o -targets=hip-amdgcn-amd-amdhsa--gfx906,hip-amdgcn-amd-amdhsa--gfx908 -input=%t.bundle.no.env -output=%t-hip-amdgcn-amd-amdhsa--gfx906.bc -output=%t-hip-amdgcn-amd-amdhsa--gfx908.bc -debug-only=CodeObjectCompatibility 2>&1 | FileCheck %s -check-prefix=BUNDLE-NO-ENV
-// BUNDLE-NO-ENV: Compatible: Exact match: [CodeObject: hip-amdgcn-amd-amdhsa--gfx906] : [Target: hip-amdgcn-amd-amdhsa--gfx906]
-// BUNDLE-NO-ENV: Compatible: Exact match: [CodeObject: hip-amdgcn-amd-amdhsa--gfx908] : [Target: hip-amdgcn-amd-amdhsa--gfx908]
+// BUNDLE-NO-ENV: Compatible: Exact match: [CodeObject: hip-amdgcn-amd-amdhsa-unknown-gfx906] : [Target: hip-amdgcn-amd-amdhsa-unknown-gfx906]
+// BUNDLE-NO-ENV: Compatible: Exact match: [CodeObject: hip-amdgcn-amd-amdhsa-unknown-gfx908] : [Target: hip-amdgcn-amd-amdhsa-unknown-gfx908]
 
 // Unbundle bundle.env while providing targets with no env
-// RUN: clang-offload-bundler -unbundle -type=o -targets=hip-amdgcn-amd-amdhsa-gfx906,hip-amdgcn-amd-amdhsa-gfx908 -input=%t.bundle.env -output=%t-hip-amdgcn-amd-amdhsa-gfx906.bc -output=%t-hip-amdgcn-amd-amdhsa-gfx908.bc -debug-only=CodeObjectCompatibility 2>&1 | FileCheck %s -check-prefix=BUNDLE-ENV
-// BUNDLE-ENV: Compatible: Exact match: [CodeObject: hip-amdgcn-amd-amdhsa--gfx906] : [Target: hip-amdgcn-amd-amdhsa--gfx906]
-// BUNDLE-ENV: Compatible: Exact match: [CodeObject: hip-amdgcn-amd-amdhsa--gfx908] : [Target: hip-amdgcn-amd-amdhsa--gfx908]
+// RUN: clang-offload-bundler -unbundle -type=o -targets=hip-amdgcn-amd-amdhsa--gfx906,hip-amdgcn-amd-amdhsa--gfx908 -input=%t.bundle.env -output=%t-hip-amdgcn-amd-amdhsa--gfx906.bc -output=%t-hip-amdgcn-amd-amdhsa--gfx908.bc -debug-only=CodeObjectCompatibility 2>&1 | FileCheck %s -check-prefix=BUNDLE-ENV
+// BUNDLE-ENV: Compatible: Exact match: [CodeObject: hip-amdgcn-amd-amdhsa-unknown-gfx906] : [Target: hip-amdgcn-amd-amdhsa-unknown-gfx906]
+// BUNDLE-ENV: Compatible: Exact match: [CodeObject: hip-amdgcn-amd-amdhsa-unknown-gfx908] : [Target: hip-amdgcn-amd-amdhsa-unknown-gfx908]
 
 // Some code so that we can create a binary out of this file.
 int A = 0;
diff --git a/clang/test/Driver/clang-offload-bundler-zstd.c b/clang/test/Driver/clang-offload-bundler-zstd.c
index 667d9554daec71..60521628a4a2ab 100644
--- a/clang/test/Driver/clang-offload-bundler-zstd.c
+++ b/clang/test/Driver/clang-offload-bundler-zstd.c
@@ -38,8 +38,8 @@
 // CHECK: Decompression method: zstd
 // CHECK: Hashes match: Yes
 // NOHOST-NOT: host-
-// NOHOST-DAG: hip-amdgcn-amd-amdhsa--gfx900
-// NOHOST-DAG: hip-amdgcn-amd-amdhsa--gfx906
+// NOHOST-DAG: hip-amdgcn-amd-amdhsa-unknown-gfx900
+// NOHOST-DAG: hip-amdgcn-amd-amdhsa-unknown-gfx906
 //
 
 // Check -compression-level= option
@@ -77,10 +77,10 @@
 // RUN:   -output=%t.hip_900.a -output=%t.hip_906.a -input=%t.hip_archive.a
 // RUN: llvm-ar t %t.hip_900.a | FileCheck -check-prefix=HIP-AR-900 %s
 // RUN: llvm-ar t %t.hip_906.a | FileCheck -check-prefix=HIP-AR-906 %s
-// HIP-AR-900-DAG: hip_bundle1-hip-amdgcn-amd-amdhsa--gfx900
-// HIP-AR-900-DAG: hip_bundle2-hip-amdgcn-amd-amdhsa--gfx900
-// HIP-AR-906-DAG: hip_bundle1-hip-amdgcn-amd-amdhsa--gfx906
-// HIP-AR-906-DAG: hip_bundle2-hip-amdgcn-amd-amdhsa--gfx906
+// HIP-AR-900-DAG: hip_bundle1-hip-amdgcn-amd-amdhsa-unknown-gfx900
+// HIP-AR-900-DAG: hip_bundle2-hip-amdgcn-amd-amdhsa-unknown-gfx900
+// HIP-AR-906-DAG: hip_bundle1-hip-amdgcn-amd-amdhsa-unknown-gfx906
+// HIP-AR-906-DAG: hip_bundle2-hip-amdgcn-amd-amdhsa-unknown-gfx906
 
 // Some code so that we can create a binary out of this file.
 int A = 0;
diff --git a/clang/test/Driver/clang-offload-bundler.c b/clang/test/Driver/clang-offload-bundler.c
index 1909ff2d71d03c..8757beab21beea 100644
--- a/clang/test/Driver/clang-offload-bundler.c
+++ b/clang/test/Driver/clang-offload-bundler.c
@@ -116,7 +116,7 @@
 // RUN: not clang-offload-bundler -type=i -targets=host-powerpc64le-ibm-linux-gnu,openmp-powerpc64le-ibm-linux-gnu,xpenmp-x86_xx-pc-linux-gnu -input=%t.i -input=%t.tgt1 -input=%t.tgt2 -output=%t.bundle.i 2>&1 | FileCheck %s --check-prefix CK-ERR8B
 // CK-ERR8B: error: invalid target 'xpenmp-x86_xx-pc-linux-gnu', unknown offloading kind 'xpenmp', unknown target triple 'x86_xx-pc-linux-gnu'
 
-// RUN: not clang-offload-bundler -type=i -targets=openmp-powerpc64le-linux,openmp-powerpc64le-ibm-linux-gnu,openmp-x86_64-pc-linux-gnu -input=%t.i -input=%t.tgt1 -input=%t.tgt2 -output=%t.bundle.i 2>&1 | FileCheck %s --check-prefix CK-ERR9A
+// RUN: not clang-offload-bundler -type=i -targets=openmp-powerpc64le-ibm-linux-gnu,openmp-x86_64-pc-linux-gnu -input=%t.i -input=%t.tgt1 -output=%t.bundle.i 2>&1 | FileCheck %s --check-prefix CK-ERR9A
 // CK-ERR9A: error: expecting exactly one host target but got 0
 
 // RUN: not clang-offload-bundler -type=i -targets=host-%itanium_abi_triple,host-%itanium_abi_triple,openmp-x86_64-pc-linux-gnu -input=%t.i -input=%t.tgt1 -input=%t.tgt2 -output=%t.bundle.i 2>&1 | FileCheck %s --check-prefix CK-ERR9B
@@ -364,15 +364,15 @@
 // RUN: not clang-offload-bundler -type=bc -input=%t.hip.bundle.bc -output=%t.tmp.bc -output=%t.tmp2.bc -unbundle \
 // RUN:   -targets=hip-amdgcn-amd-amdhsa--gfx906,hip-amdgcn-amd-amdhsa--gfx900 \
 // RUN:   2>&1 | FileCheck -check-prefix=MISS1 %s
-// MISS1: error: Can't find bundles for hip-amdgcn-amd-amdhsa--gfx906
+// MISS1: error: Can't find bundles for hip-amdgcn-amd-amdhsa-unknown-gfx906
 // RUN: not clang-offload-bundler -type=bc -input=%t.hip.bundle.bc -output=%t.tmp.bc -output=%t.tmp2.bc -unbundle \
 // RUN:   -targets=hip-amdgcn-amd-amdhsa--gfx906,hip-amdgcn-amd-amdhsa--gfx803 \
 // RUN:   2>&1 | FileCheck -check-prefix=MISS2 %s
-// MISS2: error: Can't find bundles for hip-amdgcn-amd-amdhsa--gfx803 and hip-amdgcn-amd-amdhsa--gfx906
+// MISS2: error: Can't find bundles for hip-amdgcn-amd-amdhsa-unknown-gfx803 and hip-amdgcn-amd-amdhsa-unknown-gfx906
 // RUN: not clang-offload-bundler -type=bc -input=%t.hip.bundle.bc -output=%t.tmp.bc -output=%t.tmp2.bc -output=%t.tmp3.bc -unbundle \
 // RUN:   -targets=hip-amdgcn-amd-amdhsa--gfx906,hip-amdgcn-amd-amdhsa--gfx803,hip-amdgcn-amd-amdhsa--gfx1010 \
 // RUN:   2>&1 | FileCheck -check-prefix=MISS3 %s
-// MISS3: error: Can't find bundles for hip-amdgcn-amd-amdhsa--gfx1010, hip-amdgcn-amd-amdhsa--gfx803, and hip-amdgcn-amd-amdhsa--gfx906
+// MISS3: error: Can't find bundles for hip-amdgcn-amd-amdhsa-unknown-gfx1010, hip-amdgcn-amd-amdhsa-unknown-gfx803, and hip-amdgcn-amd-amdhsa-unknown-gfx906
 
 //
 // Check error due to duplicate targets
@@ -422,10 +422,10 @@
 // RUN:   -output=%T/hip_900.a -output=%T/hip_906.a -input=%T/hip_archive.a
 // RUN: llvm-ar t %T/hip_900.a | FileCheck -check-prefix=HIP-AR-900 %s
 // RUN: llvm-ar t %T/hip_906.a | FileCheck -check-prefix=HIP-AR-906 %s
-// HIP-AR-900-DAG: hip_bundle1-hip-amdgcn-amd-amdhsa--gfx900
-// HIP-AR-900-DAG: hip_bundle2-hip-amdgcn-amd-amdhsa--gfx900
-// HIP-AR-906-DAG: hip_bundle1-hip-amdgcn-amd-amdhsa--gfx906
-// HIP-AR-906-DAG: hip_bundle2-hip-amdgcn-amd-amdhsa--gfx906
+// HIP-AR-900-DAG: hip_bundle1-hip-amdgcn-amd-amdhsa-unknown-gfx900
+// HIP-AR-900-DAG: hip_bundle2-hip-amdgcn-amd-amdhsa-unknown-gfx900
+// HIP-AR-906-DAG: hip_bundle1-hip-amdgcn-amd-amdhsa-unknown-gfx906
+// HIP-AR-906-DAG: hip_bundle2-hip-amdgcn-amd-amdhsa-unknown-gfx906
 
 //
 // Check unbundling archive for host target
@@ -469,8 +469,8 @@
 // RUN: diff %t.tgt2 %t.res.tgt2
 //
 // NOHOST-NOT: host-
-// NOHOST-DAG: hip-amdgcn-amd-amdhsa--gfx900
-// NOHOST-DAG: hip-amdgcn-amd-amdhsa--gfx906
+// NOHOST-DAG: hip-amdgcn-amd-amdhsa-unknown-gfx900
+// NOHOST-DAG: hip-amdgcn-amd-amdhsa-unknown-gfx906
 
 //
 // Check bundling ID compatibility for HIP.

@shiltian
Copy link
Contributor Author

I suppose this change is quite intrusive for any external use of clang-offload-bundler. It's gonna require all the targets listed in -targets have to be in the normalized form (4-field).

@shiltian shiltian force-pushed the users/shiltian/triple-with-environment branch from 95332e0 to 71bad77 Compare January 12, 2025 15:49
@shiltian shiltian force-pushed the users/shiltian/rework-offloadtargetinfo-ctor branch from 2fc9d92 to 1b69731 Compare January 12, 2025 15:49
Base automatically changed from users/shiltian/triple-with-environment to main January 12, 2025 15:51
@shiltian shiltian force-pushed the users/shiltian/rework-offloadtargetinfo-ctor branch 2 times, most recently from b1dbe4f to c58bbd1 Compare January 12, 2025 15:52
@shiltian shiltian changed the title [OffloadBundler] Rework the ctor of OffloadTargetInfo to support generic target [OffloadBundler] Rework the ctor of OffloadTargetInfo to support AMDGPU's generic target Jan 12, 2025
@shiltian shiltian force-pushed the users/shiltian/rework-offloadtargetinfo-ctor branch 2 times, most recently from 2fdbbac to 26ebaf0 Compare January 12, 2025 23:02
@yxsamliu
Copy link
Collaborator

Did you try this patch with internal PSDB? Does it break any existing HIP apps or libraries?

@shiltian
Copy link
Contributor Author

Did you try this patch with internal PSDB? Does it break any existing HIP apps or libraries?

Will make a PR shortly after the migration.

@lamb-j
Copy link
Contributor

lamb-j commented Jan 13, 2025

Is there a way for us to still support 3-field triples and generic targets?

In the past we worked to allow 3-field triples as bundler inputs. I would be surprised if adding a 4-field restriction didn't break something internally.

@shiltian
Copy link
Contributor Author

shiltian commented Jan 13, 2025

Is there a way for us to still support 3-field triples and generic targets?

I can't come up with a better solution because we don't know how many - could exist in the (generic) target id. Based on current situation, it could be 0, 1, and 2. With that, fixating the "known" part (aka triple) would be reasonable. Do you have any suggestions?

An alternative would be to change the syntax of target string to <kind>:<triple>:<target id>:<target features>. There is no ambiguity in this way, but I'd assume that is a more intrusive change.

In the past we worked to allow 3-field triples as bundler inputs. I would be surprised if adding a 4-field restriction didn't break something internally.

I don't know if there is any project using clang-offload-bundler directly. If there is, and we decide to enforce it, we will have to update them as well.

@lamb-j
Copy link
Contributor

lamb-j commented Jan 13, 2025

https://github.com/search?q=org%3AROCm%20clang-offload-bundler&type=code

So at least a few direct users this may break things for. There are also users that call the OffloadBundler API instead of directly calling clang-offload-bundler that this may impact.

Could we assume "gfx * -generic" is a single Target ID, and check for/parse that separately before doing rsplit on "-"?

@shiltian
Copy link
Contributor Author

Could we assume "gfx * -generic" is a single Target ID, and check for/parse that separately before doing rsplit on "-"?

I was thinking about it the other day but eventually gave up on it, because I want to keep this part as generic as possible instead of adding AMD special sauce here. I prefer not to do something like pattern matching stuff about gfx.

@shiltian
Copy link
Contributor Author

So at least a few direct users this may break things for. There are also users that call the OffloadBundler API instead of directly calling clang-offload-bundler that this may impact.

For the API usage there will be no issue because we enforce it in the code already. The only part needs to be updated is those binary user.

@shiltian shiltian force-pushed the users/shiltian/rework-offloadtargetinfo-ctor branch from 26ebaf0 to f5e2dd6 Compare January 14, 2025 16:59
@shiltian shiltian changed the base branch from main to users/shiltian/normalize-with-num-fields January 14, 2025 16:59
@jhuber6
Copy link
Contributor

jhuber6 commented Jan 14, 2025

Somewhere for the linker wrapper I just checked if the triple was recognized, you could probably just take strings after the - until it stops working.

@yxsamliu
Copy link
Collaborator

yxsamliu commented Jan 14, 2025

Somewhere for the linker wrapper I just checked if the triple was recognized, you could probably just take strings after the - until it stops working.

+1

It would be bad user experience to break existing app. It is low possibility to have env+cpu to be a valid cpu. So you could assume env exist first, if fails to parse remaining as cpu, then recoil to assume no env and parse the remaining all as cpu.

@shiltian
Copy link
Contributor Author

First of all, I don't think it can fix the issue in a robust way. Second, generic is already a valid target/cpu/offload target.

Unless we do something like, if the last part is generic, we keep looking forward until we can construct a valid target. That has no difference than doing pattern matching, which is back to my previous point about AMD special sauce.

If we assert that the offload bundler is an AMD only thing (which TBH really looks like so), I'm fine with adding a bunch of more special sauce here.

@yxsamliu
Copy link
Collaborator

First of all, I don't think it can fix the issue in a robust way. Second, generic is already a valid target/cpu/offload target.

Unless we do something like, if the last part is generic, we keep looking forward until we can construct a valid target. That has no difference than doing pattern matching, which is back to my previous point about AMD special sauce.

If we assert that the offload bundler is an AMD only thing (which TBH really looks like so), I'm fine with adding a bunch of more special sauce here.

offload-bundler is not an AMD only thing. At least HIPSPRV toolchain uses it, which is Intel GPU.

Still, I think it is possible to make it generic with minor assumption. Let's say you are now about to parsing the final part of the target ID string which may be either "env-cpu" or "cpu" without env. clang has a function getCanonicalProcessorName() which can check whether a string is a valid cpu name. Just pass the remaining string to it. If true, that means the remaining is a cpu, without env string. Otherwise, assuming there is an env string that contains no "-" and split it.

@shiltian
Copy link
Contributor Author

Still, I think it is possible to make it generic with minor assumption. Let's say you are now about to parsing the final part of the target ID string which may be either "env-cpu" or "cpu" without env.

This is not actually the issue. The issue is when the cpu is a generic target, such as gfx10-3-generic. By the current logic, the target id after split is generic, which is totally a valid one, and leave the rest with things like hip-amd-amdhsa-amd-gfx10-3.

@yxsamliu
Copy link
Collaborator

Still, I think it is possible to make it generic with minor assumption. Let's say you are now about to parsing the final part of the target ID string which may be either "env-cpu" or "cpu" without env.

This is not actually the issue. The issue is when the cpu is a generic target, such as gfx10-3-generic. By the current logic, the target id after split is generic, which is totally a valid one, and leave the rest with things like hip-amd-amdhsa-amd-gfx10-3.

That is probably due to this line https://github.com/llvm/llvm-project/blob/main/clang/lib/Driver/OffloadBundler.cpp#L88

It assumes there is no '-' in GPU name.

we could add a loop. If that line fails, we will split at the second '-' from right.

@yxsamliu
Copy link
Collaborator

Still, I think it is possible to make it generic with minor assumption. Let's say you are now about to parsing the final part of the target ID string which may be either "env-cpu" or "cpu" without env.

This is not actually the issue. The issue is when the cpu is a generic target, such as gfx10-3-generic. By the current logic, the target id after split is generic, which is totally a valid one, and leave the rest with things like hip-amd-amdhsa-amd-gfx10-3.

That is probably due to this line https://github.com/llvm/llvm-project/blob/main/clang/lib/Driver/OffloadBundler.cpp#L88

It assumes there is no '-' in GPU name.

we could add a loop. If that line fails, we will split at the second '-' from right.

Ok I get your point. since generic is a valid GPU name. it will stop there.

@yxsamliu
Copy link
Collaborator

how about assuming the strict triple format first, this will make the generic GPU arch work with strict triple.

If the first assumption fails, then fall back to the legacy parsing, that is, assuming no '-' in GPU arch and split at the right most '-'. This way, the old target ID string with non-strict triple still works.

@shiltian shiltian force-pushed the users/shiltian/normalize-with-num-fields branch from c7c5b74 to bd285ed Compare January 14, 2025 19:17
@shiltian shiltian force-pushed the users/shiltian/rework-offloadtargetinfo-ctor branch from c2588fe to e7eac39 Compare January 14, 2025 19:17
@shiltian
Copy link
Contributor Author

shiltian commented Jan 14, 2025

how about assuming the strict triple format first, this will make the generic GPU arch work with strict triple.

That doesn't work if someone write a 3-tuple with a generic target because neither path can handle that.

@yxsamliu
Copy link
Collaborator

how about assuming the strict triple format first, this will make the generic GPU arch work with strict triple.

That doesn't work if someone write a 3-tuple with a generic target because neither path can handle that.

That's true. But your current approach does not work for that use case either, right? For the generic target, we request the 4-tuple. For the old target, we allow the 3-tuple to avoid regressions.

@shiltian
Copy link
Contributor Author

shiltian commented Jan 15, 2025

For the generic target, we request the 4-tuple.

I don't think we can enforce that.

For the old target, we allow the 3-tuple to avoid regressions.

I'm gonna submit PR to all the related projects (e.g. https://github.com/search?q=org%3AROCm%20clang-offload-bundler&type=code) to use 4-tuple.

I've created issues in all the projects mentioned above about the potential intrusive changes in clang-offload-bundler.

Base automatically changed from users/shiltian/normalize-with-num-fields to main January 15, 2025 01:12
@shiltian shiltian force-pushed the users/shiltian/rework-offloadtargetinfo-ctor branch 4 times, most recently from 2055d82 to 1b13d14 Compare January 16, 2025 16:59
@lamb-j
Copy link
Contributor

lamb-j commented Jan 17, 2025

Could we possibly do a phase-out approach?

For COV5 and earlier we keep the current strategy, allowing both 3 and 4 field triples

For COV6 (including generics), we enforce the 4 field triples

This could help us not break things internally, and then as tools and APIs update to V6, they can also update to 4-field triples

@shiltian
Copy link
Contributor Author

shiltian commented Jan 17, 2025

Isn't offload bundler a target agnostic tool?

This could help us not break things internally, and then as tools and APIs update to V6, they can also update to 4-field triples

PSDB has full green (though some tests were skipped due to PSDB issue but I understand this doesn't mean we wouldn't break anything).

I also created issues to all repos that use clang-offload-bundler binary directly. Most of them have responded and started using 4-tuple now. IMHO if we tell users, hey, if you do COV5, you can use 3- or 4-tuple, but if you do COV6, you have to use 4-tuple, it's gonna me way more confusing than directly enforcing them. In addition, internally we have switched to COV6 by default already for staging branch, where this PR is gonna go, and will be soon in mainline.

@shiltian shiltian force-pushed the users/shiltian/rework-offloadtargetinfo-ctor branch from 1b13d14 to 0075687 Compare February 10, 2025 16:19
@jhuber6
Copy link
Contributor

jhuber6 commented Feb 10, 2025

Isn't offload bundler a target agnostic tool?

Would it break anything if we just assumed it was always amdgcn-amd-amdhsa for the triple? It's supposed to be generic, but right now only HIP uses it AFAICT.

…neric target

The current parsing of target string assumes to be in a form of
`kind-triple-targetid:feature`, such as
`hipv4-amdgcn-amd-amdhsa-gfx1030:+xnack`. Specifically, the target id does not
contain any `-`, which is not the case for generic target. Also, a generic
target may contain one or more `-`, such as `gfx10-3-generic` and
`gfx12-generic`. As a result, we can no longer depend on `rstrip` to get things
work right. This patch reworks the logic to parse the target string to make it
more robust, as well as supporting generic target.
@shiltian shiltian force-pushed the users/shiltian/rework-offloadtargetinfo-ctor branch from 0075687 to 84b10d4 Compare February 15, 2025 00:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang Clang issues not falling into any other category llvm-lit testing-tools
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants