-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[OffloadBundler] Rework the ctor of OffloadTargetInfo
to support AMDGPU's generic target
#122629
base: main
Are you sure you want to change the base?
Conversation
This stack of pull requests is managed by Graphite. Learn more about stacking. |
@llvm/pr-subscribers-testing-tools @llvm/pr-subscribers-clang-driver Author: Shilei Tian (shiltian) ChangesThe current parsing of target string assumes to be in a form of Full diff: https://github.com/llvm/llvm-project/pull/122629.diff 6 Files Affected:
diff --git a/clang/docs/ClangOffloadBundler.rst b/clang/docs/ClangOffloadBundler.rst
index 3c241027d405ca..25214c2ea6a4e1 100644
--- a/clang/docs/ClangOffloadBundler.rst
+++ b/clang/docs/ClangOffloadBundler.rst
@@ -266,15 +266,14 @@ without differentiation based on offload kind.
The target triple of the code object. See `Target Triple
<https://clang.llvm.org/docs/CrossCompilation.html#target-triple>`_.
- The bundler accepts target triples with or without the optional environment
- field:
+ LLVM target triples can be with or without the optional environment field:
``<arch><sub>-<vendor>-<sys>``, or
``<arch><sub>-<vendor>-<sys>-<env>``
However, in order to standardize outputs for tools that consume bitcode
- bundles, bundles written by the bundler internally use only the 4-field
- target triple:
+ bundles, the bundler only accepts target triples with the 4-field target
+ triple:
``<arch><sub>-<vendor>-<sys>-<env>``
diff --git a/clang/lib/Driver/OffloadBundler.cpp b/clang/lib/Driver/OffloadBundler.cpp
index 2d6bdff0393be5..474b5a0fa4148f 100644
--- a/clang/lib/Driver/OffloadBundler.cpp
+++ b/clang/lib/Driver/OffloadBundler.cpp
@@ -84,31 +84,19 @@ OffloadTargetInfo::OffloadTargetInfo(const StringRef Target,
: BundlerConfig(BC) {
// TODO: Add error checking from ClangOffloadBundler.cpp
- auto TargetFeatures = Target.split(':');
- auto TripleOrGPU = TargetFeatures.first.rsplit('-');
-
- if (clang::StringToOffloadArch(TripleOrGPU.second) !=
- clang::OffloadArch::UNKNOWN) {
- auto KindTriple = TripleOrGPU.first.split('-');
- this->OffloadKind = KindTriple.first;
-
- // Enforce optional env field to standardize bundles
- llvm::Triple t = llvm::Triple(KindTriple.second);
- this->Triple = llvm::Triple(t.getArchName(), t.getVendorName(),
- t.getOSName(), t.getEnvironmentName());
-
- this->TargetID = Target.substr(Target.find(TripleOrGPU.second));
- } else {
- auto KindTriple = TargetFeatures.first.split('-');
- this->OffloadKind = KindTriple.first;
-
- // Enforce optional env field to standardize bundles
- llvm::Triple t = llvm::Triple(KindTriple.second);
- this->Triple = llvm::Triple(t.getArchName(), t.getVendorName(),
- t.getOSName(), t.getEnvironmentName());
-
- this->TargetID = "";
- }
+ // <kind>-<triple>[-<target id>]
+ // <tiple> := <arch>-<vendor>-<os>-<env>
+ SmallVector<StringRef, 6> Components;
+ Target.split(':').first.split(Components, '-', /*MaxSplit=*/5);
+ assert((Components.size() == 5 || Components.size() == 6) &&
+ "malformed target string");
+
+ this->OffloadKind = Components.front();
+ this->TargetID = Components.size() == 6 ? Components.back() : "";
+ ArrayRef<StringRef> TripleSlice{&Components[1], /*length=*/4};
+ llvm::Triple T = llvm::Triple(llvm::join(TripleSlice, "-"));
+ this->Triple = llvm::Triple(T.getArchName(), T.getVendorName(), T.getOSName(),
+ T.getEnvironmentName());
}
bool OffloadTargetInfo::hasHostKind() const {
@@ -148,7 +136,7 @@ bool OffloadTargetInfo::operator==(const OffloadTargetInfo &Target) const {
}
std::string OffloadTargetInfo::str() const {
- return Twine(OffloadKind + "-" + Triple.str() + "-" + TargetID).str();
+ return Twine(OffloadKind + "-" + Triple.normalize() + "-" + TargetID).str();
}
static StringRef getDeviceFileExtension(StringRef Device,
diff --git a/clang/test/Driver/clang-offload-bundler-asserts-on.c b/clang/test/Driver/clang-offload-bundler-asserts-on.c
index 55060c2c42e734..eec30674107b25 100644
--- a/clang/test/Driver/clang-offload-bundler-asserts-on.c
+++ b/clang/test/Driver/clang-offload-bundler-asserts-on.c
@@ -15,20 +15,20 @@
// Check code object compatibility for archive unbundling
//
// Create few code object bundles and archive them to create an input archive
-// RUN: clang-offload-bundler -type=o -targets=host-%itanium_abi_triple,openmp-amdgcn-amd-amdhsa-gfx906,openmp-amdgcn-amd-amdhsa--gfx908 -input=%t.o -input=%t.tgt1 -input=%t.tgt2 -output=%t.simple.bundle
+// RUN: clang-offload-bundler -type=o -targets=host-%itanium_abi_triple,openmp-amdgcn-amd-amdhsa--gfx906,openmp-amdgcn-amd-amdhsa--gfx908 -input=%t.o -input=%t.tgt1 -input=%t.tgt2 -output=%t.simple.bundle
// RUN: clang-offload-bundler -type=o -targets=host-%itanium_abi_triple,openmp-amdgcn-amd-amdhsa--gfx906:sramecc+:xnack+,openmp-amdgcn-amd-amdhsa--gfx908:sramecc+:xnack+ -inputs=%t.o,%t.tgt1,%t.tgt1 -outputs=%t.targetID1.bundle
// RUN: clang-offload-bundler -type=o -targets=host-%itanium_abi_triple,openmp-amdgcn-amd-amdhsa--gfx906:sramecc+:xnack-,openmp-amdgcn-amd-amdhsa--gfx908:sramecc+:xnack- -inputs=%t.o,%t.tgt1,%t.tgt1 -outputs=%t.targetID2.bundle
// RUN: clang-offload-bundler -type=o -targets=host-%itanium_abi_triple,openmp-amdgcn-amd-amdhsa--gfx906:xnack-,openmp-amdgcn-amd-amdhsa--gfx908:xnack- -inputs=%t.o,%t.tgt1,%t.tgt1 -outputs=%t.targetID3.bundle
// RUN: llvm-ar cr %t.input-archive.a %t.simple.bundle %t.targetID1.bundle %t.targetID2.bundle %t.targetID3.bundle
// Tests to check compatibility between Bundle Entry ID formats i.e. between presence/absence of extra hyphen in case of missing environment field
-// RUN: clang-offload-bundler -unbundle -type=a -targets=openmp-amdgcn-amd-amdhsa--gfx906,openmp-amdgcn-amd-amdhsa-gfx908 -input=%t.input-archive.a -output=%t-archive-gfx906-simple.a -output=%t-archive-gfx908-simple.a -debug-only=CodeObjectCompatibility 2>&1 | FileCheck %s -check-prefix=BUNDLECOMPATIBILITY
-// BUNDLECOMPATIBILITY: Compatible: Exact match: [CodeObject: openmp-amdgcn-amd-amdhsa--gfx906] : [Target: openmp-amdgcn-amd-amdhsa--gfx906]
-// BUNDLECOMPATIBILITY: Compatible: Exact match: [CodeObject: openmp-amdgcn-amd-amdhsa--gfx908] : [Target: openmp-amdgcn-amd-amdhsa--gfx908]
+// RUN: clang-offload-bundler -unbundle -type=a -targets=openmp-amdgcn-amd-amdhsa--gfx906,openmp-amdgcn-amd-amdhsa--gfx908 -input=%t.input-archive.a -output=%t-archive-gfx906-simple.a -output=%t-archive-gfx908-simple.a -debug-only=CodeObjectCompatibility 2>&1 | FileCheck %s -check-prefix=BUNDLECOMPATIBILITY
+// BUNDLECOMPATIBILITY: Compatible: Exact match: [CodeObject: openmp-amdgcn-amd-amdhsa-unknown-gfx906] : [Target: openmp-amdgcn-amd-amdhsa-unknown-gfx906]
+// BUNDLECOMPATIBILITY: Compatible: Exact match: [CodeObject: openmp-amdgcn-amd-amdhsa-unknown-gfx908] : [Target: openmp-amdgcn-amd-amdhsa-unknown-gfx908]
-// RUN: clang-offload-bundler -unbundle -type=a -targets=hip-amdgcn-amd-amdhsa--gfx906,hipv4-amdgcn-amd-amdhsa-gfx908 -input=%t.input-archive.a -output=%t-hip-archive-gfx906-simple.a -output=%t-hipv4-archive-gfx908-simple.a -hip-openmp-compatible -debug-only=CodeObjectCompatibility 2>&1 | FileCheck %s -check-prefix=HIPOpenMPCOMPATIBILITY
-// HIPOpenMPCOMPATIBILITY: Compatible: Target IDs are compatible [CodeObject: openmp-amdgcn-amd-amdhsa--gfx906] : [Target: hip-amdgcn-amd-amdhsa--gfx906]
-// HIPOpenMPCOMPATIBILITY: Compatible: Target IDs are compatible [CodeObject: openmp-amdgcn-amd-amdhsa--gfx908] : [Target: hipv4-amdgcn-amd-amdhsa--gfx908]
+// RUN: clang-offload-bundler -unbundle -type=a -targets=hip-amdgcn-amd-amdhsa--gfx906,hipv4-amdgcn-amd-amdhsa--gfx908 -input=%t.input-archive.a -output=%t-hip-archive-gfx906-simple.a -output=%t-hipv4-archive-gfx908-simple.a -hip-openmp-compatible -debug-only=CodeObjectCompatibility 2>&1 | FileCheck %s -check-prefix=HIPOpenMPCOMPATIBILITY
+// HIPOpenMPCOMPATIBILITY: Compatible: Target IDs are compatible [CodeObject: openmp-amdgcn-amd-amdhsa-unknown-gfx906] : [Target: hip-amdgcn-amd-amdhsa-unknown-gfx906]
+// HIPOpenMPCOMPATIBILITY: Compatible: Target IDs are compatible [CodeObject: openmp-amdgcn-amd-amdhsa-unknown-gfx908] : [Target: hipv4-amdgcn-amd-amdhsa-unknown-gfx908]
// Some code so that we can create a binary out of this file.
int A = 0;
diff --git a/clang/test/Driver/clang-offload-bundler-standardize.c b/clang/test/Driver/clang-offload-bundler-standardize.c
index 52f5ea038e47b8..6550b8051987ab 100644
--- a/clang/test/Driver/clang-offload-bundler-standardize.c
+++ b/clang/test/Driver/clang-offload-bundler-standardize.c
@@ -16,19 +16,18 @@
// Check code object compatibility for archive unbundling
//
// Create an object bundle with and without env fields
-// RUN: clang-offload-bundler -type=o -targets=host-%itanium_abi_triple,hip-amdgcn-amd-amdhsa-gfx906,hip-amdgcn-amd-amdhsa-gfx908 -input=%t.o -input=%t.tgt1 -input=%t.tgt2 -output=%t.bundle.no.env
-// RUN: clang-offload-bundler -type=o -targets=host-%itanium_abi_triple-,hip-amdgcn-amd-amdhsa--gfx906,hip-amdgcn-amd-amdhsa--gfx908 -input=%t.o -input=%t.tgt1 -input=%t.tgt2 -output=%t.bundle.env
+// RUN: clang-offload-bundler -type=o -targets=host-%itanium_abi_triple,hip-amdgcn-amd-amdhsa--gfx906,hip-amdgcn-amd-amdhsa--gfx908 -input=%t.o -input=%t.tgt1 -input=%t.tgt2 -output=%t.bundle.no.env
// Unbundle bundle.no.env while providing targets with env
// RUN: clang-offload-bundler -unbundle -type=o -targets=hip-amdgcn-amd-amdhsa--gfx906,hip-amdgcn-amd-amdhsa--gfx908 -input=%t.bundle.no.env -output=%t-hip-amdgcn-amd-amdhsa--gfx906.bc -output=%t-hip-amdgcn-amd-amdhsa--gfx908.bc -debug-only=CodeObjectCompatibility 2>&1 | FileCheck %s -check-prefix=BUNDLE-NO-ENV
-// BUNDLE-NO-ENV: Compatible: Exact match: [CodeObject: hip-amdgcn-amd-amdhsa--gfx906] : [Target: hip-amdgcn-amd-amdhsa--gfx906]
-// BUNDLE-NO-ENV: Compatible: Exact match: [CodeObject: hip-amdgcn-amd-amdhsa--gfx908] : [Target: hip-amdgcn-amd-amdhsa--gfx908]
+// BUNDLE-NO-ENV: Compatible: Exact match: [CodeObject: hip-amdgcn-amd-amdhsa-unknown-gfx906] : [Target: hip-amdgcn-amd-amdhsa-unknown-gfx906]
+// BUNDLE-NO-ENV: Compatible: Exact match: [CodeObject: hip-amdgcn-amd-amdhsa-unknown-gfx908] : [Target: hip-amdgcn-amd-amdhsa-unknown-gfx908]
// Unbundle bundle.env while providing targets with no env
-// RUN: clang-offload-bundler -unbundle -type=o -targets=hip-amdgcn-amd-amdhsa-gfx906,hip-amdgcn-amd-amdhsa-gfx908 -input=%t.bundle.env -output=%t-hip-amdgcn-amd-amdhsa-gfx906.bc -output=%t-hip-amdgcn-amd-amdhsa-gfx908.bc -debug-only=CodeObjectCompatibility 2>&1 | FileCheck %s -check-prefix=BUNDLE-ENV
-// BUNDLE-ENV: Compatible: Exact match: [CodeObject: hip-amdgcn-amd-amdhsa--gfx906] : [Target: hip-amdgcn-amd-amdhsa--gfx906]
-// BUNDLE-ENV: Compatible: Exact match: [CodeObject: hip-amdgcn-amd-amdhsa--gfx908] : [Target: hip-amdgcn-amd-amdhsa--gfx908]
+// RUN: clang-offload-bundler -unbundle -type=o -targets=hip-amdgcn-amd-amdhsa--gfx906,hip-amdgcn-amd-amdhsa--gfx908 -input=%t.bundle.env -output=%t-hip-amdgcn-amd-amdhsa--gfx906.bc -output=%t-hip-amdgcn-amd-amdhsa--gfx908.bc -debug-only=CodeObjectCompatibility 2>&1 | FileCheck %s -check-prefix=BUNDLE-ENV
+// BUNDLE-ENV: Compatible: Exact match: [CodeObject: hip-amdgcn-amd-amdhsa-unknown-gfx906] : [Target: hip-amdgcn-amd-amdhsa-unknown-gfx906]
+// BUNDLE-ENV: Compatible: Exact match: [CodeObject: hip-amdgcn-amd-amdhsa-unknown-gfx908] : [Target: hip-amdgcn-amd-amdhsa-unknown-gfx908]
// Some code so that we can create a binary out of this file.
int A = 0;
diff --git a/clang/test/Driver/clang-offload-bundler-zstd.c b/clang/test/Driver/clang-offload-bundler-zstd.c
index 667d9554daec71..60521628a4a2ab 100644
--- a/clang/test/Driver/clang-offload-bundler-zstd.c
+++ b/clang/test/Driver/clang-offload-bundler-zstd.c
@@ -38,8 +38,8 @@
// CHECK: Decompression method: zstd
// CHECK: Hashes match: Yes
// NOHOST-NOT: host-
-// NOHOST-DAG: hip-amdgcn-amd-amdhsa--gfx900
-// NOHOST-DAG: hip-amdgcn-amd-amdhsa--gfx906
+// NOHOST-DAG: hip-amdgcn-amd-amdhsa-unknown-gfx900
+// NOHOST-DAG: hip-amdgcn-amd-amdhsa-unknown-gfx906
//
// Check -compression-level= option
@@ -77,10 +77,10 @@
// RUN: -output=%t.hip_900.a -output=%t.hip_906.a -input=%t.hip_archive.a
// RUN: llvm-ar t %t.hip_900.a | FileCheck -check-prefix=HIP-AR-900 %s
// RUN: llvm-ar t %t.hip_906.a | FileCheck -check-prefix=HIP-AR-906 %s
-// HIP-AR-900-DAG: hip_bundle1-hip-amdgcn-amd-amdhsa--gfx900
-// HIP-AR-900-DAG: hip_bundle2-hip-amdgcn-amd-amdhsa--gfx900
-// HIP-AR-906-DAG: hip_bundle1-hip-amdgcn-amd-amdhsa--gfx906
-// HIP-AR-906-DAG: hip_bundle2-hip-amdgcn-amd-amdhsa--gfx906
+// HIP-AR-900-DAG: hip_bundle1-hip-amdgcn-amd-amdhsa-unknown-gfx900
+// HIP-AR-900-DAG: hip_bundle2-hip-amdgcn-amd-amdhsa-unknown-gfx900
+// HIP-AR-906-DAG: hip_bundle1-hip-amdgcn-amd-amdhsa-unknown-gfx906
+// HIP-AR-906-DAG: hip_bundle2-hip-amdgcn-amd-amdhsa-unknown-gfx906
// Some code so that we can create a binary out of this file.
int A = 0;
diff --git a/clang/test/Driver/clang-offload-bundler.c b/clang/test/Driver/clang-offload-bundler.c
index 1909ff2d71d03c..8757beab21beea 100644
--- a/clang/test/Driver/clang-offload-bundler.c
+++ b/clang/test/Driver/clang-offload-bundler.c
@@ -116,7 +116,7 @@
// RUN: not clang-offload-bundler -type=i -targets=host-powerpc64le-ibm-linux-gnu,openmp-powerpc64le-ibm-linux-gnu,xpenmp-x86_xx-pc-linux-gnu -input=%t.i -input=%t.tgt1 -input=%t.tgt2 -output=%t.bundle.i 2>&1 | FileCheck %s --check-prefix CK-ERR8B
// CK-ERR8B: error: invalid target 'xpenmp-x86_xx-pc-linux-gnu', unknown offloading kind 'xpenmp', unknown target triple 'x86_xx-pc-linux-gnu'
-// RUN: not clang-offload-bundler -type=i -targets=openmp-powerpc64le-linux,openmp-powerpc64le-ibm-linux-gnu,openmp-x86_64-pc-linux-gnu -input=%t.i -input=%t.tgt1 -input=%t.tgt2 -output=%t.bundle.i 2>&1 | FileCheck %s --check-prefix CK-ERR9A
+// RUN: not clang-offload-bundler -type=i -targets=openmp-powerpc64le-ibm-linux-gnu,openmp-x86_64-pc-linux-gnu -input=%t.i -input=%t.tgt1 -output=%t.bundle.i 2>&1 | FileCheck %s --check-prefix CK-ERR9A
// CK-ERR9A: error: expecting exactly one host target but got 0
// RUN: not clang-offload-bundler -type=i -targets=host-%itanium_abi_triple,host-%itanium_abi_triple,openmp-x86_64-pc-linux-gnu -input=%t.i -input=%t.tgt1 -input=%t.tgt2 -output=%t.bundle.i 2>&1 | FileCheck %s --check-prefix CK-ERR9B
@@ -364,15 +364,15 @@
// RUN: not clang-offload-bundler -type=bc -input=%t.hip.bundle.bc -output=%t.tmp.bc -output=%t.tmp2.bc -unbundle \
// RUN: -targets=hip-amdgcn-amd-amdhsa--gfx906,hip-amdgcn-amd-amdhsa--gfx900 \
// RUN: 2>&1 | FileCheck -check-prefix=MISS1 %s
-// MISS1: error: Can't find bundles for hip-amdgcn-amd-amdhsa--gfx906
+// MISS1: error: Can't find bundles for hip-amdgcn-amd-amdhsa-unknown-gfx906
// RUN: not clang-offload-bundler -type=bc -input=%t.hip.bundle.bc -output=%t.tmp.bc -output=%t.tmp2.bc -unbundle \
// RUN: -targets=hip-amdgcn-amd-amdhsa--gfx906,hip-amdgcn-amd-amdhsa--gfx803 \
// RUN: 2>&1 | FileCheck -check-prefix=MISS2 %s
-// MISS2: error: Can't find bundles for hip-amdgcn-amd-amdhsa--gfx803 and hip-amdgcn-amd-amdhsa--gfx906
+// MISS2: error: Can't find bundles for hip-amdgcn-amd-amdhsa-unknown-gfx803 and hip-amdgcn-amd-amdhsa-unknown-gfx906
// RUN: not clang-offload-bundler -type=bc -input=%t.hip.bundle.bc -output=%t.tmp.bc -output=%t.tmp2.bc -output=%t.tmp3.bc -unbundle \
// RUN: -targets=hip-amdgcn-amd-amdhsa--gfx906,hip-amdgcn-amd-amdhsa--gfx803,hip-amdgcn-amd-amdhsa--gfx1010 \
// RUN: 2>&1 | FileCheck -check-prefix=MISS3 %s
-// MISS3: error: Can't find bundles for hip-amdgcn-amd-amdhsa--gfx1010, hip-amdgcn-amd-amdhsa--gfx803, and hip-amdgcn-amd-amdhsa--gfx906
+// MISS3: error: Can't find bundles for hip-amdgcn-amd-amdhsa-unknown-gfx1010, hip-amdgcn-amd-amdhsa-unknown-gfx803, and hip-amdgcn-amd-amdhsa-unknown-gfx906
//
// Check error due to duplicate targets
@@ -422,10 +422,10 @@
// RUN: -output=%T/hip_900.a -output=%T/hip_906.a -input=%T/hip_archive.a
// RUN: llvm-ar t %T/hip_900.a | FileCheck -check-prefix=HIP-AR-900 %s
// RUN: llvm-ar t %T/hip_906.a | FileCheck -check-prefix=HIP-AR-906 %s
-// HIP-AR-900-DAG: hip_bundle1-hip-amdgcn-amd-amdhsa--gfx900
-// HIP-AR-900-DAG: hip_bundle2-hip-amdgcn-amd-amdhsa--gfx900
-// HIP-AR-906-DAG: hip_bundle1-hip-amdgcn-amd-amdhsa--gfx906
-// HIP-AR-906-DAG: hip_bundle2-hip-amdgcn-amd-amdhsa--gfx906
+// HIP-AR-900-DAG: hip_bundle1-hip-amdgcn-amd-amdhsa-unknown-gfx900
+// HIP-AR-900-DAG: hip_bundle2-hip-amdgcn-amd-amdhsa-unknown-gfx900
+// HIP-AR-906-DAG: hip_bundle1-hip-amdgcn-amd-amdhsa-unknown-gfx906
+// HIP-AR-906-DAG: hip_bundle2-hip-amdgcn-amd-amdhsa-unknown-gfx906
//
// Check unbundling archive for host target
@@ -469,8 +469,8 @@
// RUN: diff %t.tgt2 %t.res.tgt2
//
// NOHOST-NOT: host-
-// NOHOST-DAG: hip-amdgcn-amd-amdhsa--gfx900
-// NOHOST-DAG: hip-amdgcn-amd-amdhsa--gfx906
+// NOHOST-DAG: hip-amdgcn-amd-amdhsa-unknown-gfx900
+// NOHOST-DAG: hip-amdgcn-amd-amdhsa-unknown-gfx906
//
// Check bundling ID compatibility for HIP.
|
I suppose this change is quite intrusive for any external use of |
95332e0
to
71bad77
Compare
2fc9d92
to
1b69731
Compare
b1dbe4f
to
c58bbd1
Compare
OffloadTargetInfo
to support generic targetOffloadTargetInfo
to support AMDGPU's generic target
2fdbbac
to
26ebaf0
Compare
Did you try this patch with internal PSDB? Does it break any existing HIP apps or libraries? |
Will make a PR shortly after the migration. |
Is there a way for us to still support 3-field triples and generic targets? In the past we worked to allow 3-field triples as bundler inputs. I would be surprised if adding a 4-field restriction didn't break something internally. |
I can't come up with a better solution because we don't know how many An alternative would be to change the syntax of target string to
I don't know if there is any project using |
https://github.com/search?q=org%3AROCm%20clang-offload-bundler&type=code So at least a few direct users this may break things for. There are also users that call the OffloadBundler API instead of directly calling clang-offload-bundler that this may impact. Could we assume "gfx * -generic" is a single Target ID, and check for/parse that separately before doing rsplit on "-"? |
I was thinking about it the other day but eventually gave up on it, because I want to keep this part as generic as possible instead of adding AMD special sauce here. I prefer not to do something like pattern matching stuff about |
For the API usage there will be no issue because we enforce it in the code already. The only part needs to be updated is those binary user. |
26ebaf0
to
f5e2dd6
Compare
f5e2dd6
to
c2588fe
Compare
Somewhere for the linker wrapper I just checked if the triple was recognized, you could probably just take strings after the |
+1 It would be bad user experience to break existing app. It is low possibility to have env+cpu to be a valid cpu. So you could assume env exist first, if fails to parse remaining as cpu, then recoil to assume no env and parse the remaining all as cpu. |
First of all, I don't think it can fix the issue in a robust way. Second, Unless we do something like, if the last part is If we assert that the offload bundler is an AMD only thing (which TBH really looks like so), I'm fine with adding a bunch of more special sauce here. |
offload-bundler is not an AMD only thing. At least HIPSPRV toolchain uses it, which is Intel GPU. Still, I think it is possible to make it generic with minor assumption. Let's say you are now about to parsing the final part of the target ID string which may be either "env-cpu" or "cpu" without env. clang has a function getCanonicalProcessorName() which can check whether a string is a valid cpu name. Just pass the remaining string to it. If true, that means the remaining is a cpu, without env string. Otherwise, assuming there is an env string that contains no "-" and split it. |
This is not actually the issue. The issue is when the cpu is a generic target, such as |
That is probably due to this line https://github.com/llvm/llvm-project/blob/main/clang/lib/Driver/OffloadBundler.cpp#L88 It assumes there is no '-' in GPU name. we could add a loop. If that line fails, we will split at the second '-' from right. |
Ok I get your point. since generic is a valid GPU name. it will stop there. |
how about assuming the strict triple format first, this will make the generic GPU arch work with strict triple. If the first assumption fails, then fall back to the legacy parsing, that is, assuming no '-' in GPU arch and split at the right most '-'. This way, the old target ID string with non-strict triple still works. |
c7c5b74
to
bd285ed
Compare
c2588fe
to
e7eac39
Compare
That doesn't work if someone write a 3-tuple with a generic target because neither path can handle that. |
That's true. But your current approach does not work for that use case either, right? For the generic target, we request the 4-tuple. For the old target, we allow the 3-tuple to avoid regressions. |
I don't think we can enforce that.
I've created issues in all the projects mentioned above about the potential intrusive changes in |
2055d82
to
1b13d14
Compare
Could we possibly do a phase-out approach? For COV5 and earlier we keep the current strategy, allowing both 3 and 4 field triples For COV6 (including generics), we enforce the 4 field triples This could help us not break things internally, and then as tools and APIs update to V6, they can also update to 4-field triples |
Isn't offload bundler a target agnostic tool?
PSDB has full green (though some tests were skipped due to PSDB issue but I understand this doesn't mean we wouldn't break anything). I also created issues to all repos that use |
1b13d14
to
0075687
Compare
Would it break anything if we just assumed it was always |
…neric target The current parsing of target string assumes to be in a form of `kind-triple-targetid:feature`, such as `hipv4-amdgcn-amd-amdhsa-gfx1030:+xnack`. Specifically, the target id does not contain any `-`, which is not the case for generic target. Also, a generic target may contain one or more `-`, such as `gfx10-3-generic` and `gfx12-generic`. As a result, we can no longer depend on `rstrip` to get things work right. This patch reworks the logic to parse the target string to make it more robust, as well as supporting generic target.
0075687
to
84b10d4
Compare
The current parsing logic for the target string assumes it follows the format
<kind>-<triple>-<target id>:<feature>
, such ashipv4-amdgcn-amd-amdhsa-gfx1030:+xnack
.Specifically, it assumes that
<target id>
does not contain any-
, relying onrsplit
for parsing.However, this assumption breaks for AMDGPU's generic targets, which may contain one or more
-
, such asgfx10-3-generic
orgfx12-generic
.As a result, the existing approach using
rstrip
is no longer reliable.This patch reworks the parsing logic to handle target strings more robustly, including support for generic targets.
The bundler now strictly requires a 4-field target triple.
Additionally, a new Python helper function has been added to
config.py
to normalize the target triple into the 4-field format when it is not, ensuring tests pass reliably.