public inbox for git-commits@fedoraproject.org
From: Tom Rix <Tom.Rix@amd.com>
To: git-commits@fedoraproject.org
Subject: [rpms/rocblas] epel10: Add --with preview
Date: Thu, 11 Jun 2026 14:33:11 GMT
Message-ID: <178118839140.1.11127389856776383527.rpms-rocblas-d4ebec97e49d@fedoraproject.org>
A new commit has been pushed.
Repo : rpms/rocblas
Branch : epel10
Commit : d4ebec97e49d38d814eb50c8b83abc6cec93b00c
Author : Tom Rix <Tom.Rix@amd.com>
Date : 2026-03-10T11:37:16-07:00
Stats : +986/-39 in 9 file(s)
URL : https://src.fedoraproject.org/rpms/rocblas/c/d4ebec97e49d38d814eb50c8b83abc6cec93b00c?branch=epel10
Log:
Add --with preview
Change tensile source to rocm-libraries
Signed-off-by: Tom Rix <Tom.Rix@amd.com>
---
diff --git a/.gitignore b/.gitignore
index ac175f0..eada8e5 100644
--- a/.gitignore
+++ b/.gitignore
@@ -16,3 +16,4 @@
/Tensile-7.1.1.tar.gz
/Tensile-7.2.0.tar.gz
/rocblas-7.2.0.tar.gz
+/tensile-7.2.0.tar.gz
diff --git a/0001-improve-the-warning-for-asm-caps-mismatches.patch b/0001-improve-the-warning-for-asm-caps-mismatches.patch
new file mode 100644
index 0000000..66005bf
--- /dev/null
+++ b/0001-improve-the-warning-for-asm-caps-mismatches.patch
@@ -0,0 +1,43 @@
+From 22338f7f0aa80c41b04ff4075a9b39957228d219 Mon Sep 17 00:00:00 2001
+From: Tom Rix <Tom.Rix@amd.com>
+Date: Sun, 8 Mar 2026 10:48:50 -0700
+Subject: [PATCH 1/6] improve the warning for asm caps mismatches
+
+This change prints out the differing key/value pairs when there
+is a difference between the derived and cached asm tables.
+
+Signed-off-by: Tom Rix <Tom.Rix@amd.com>
+---
+ shared/tensile/Tensile/Common.py | 9 +++++++++
+ 1 file changed, 9 insertions(+)
+
+diff --git a/shared/tensile/Tensile/Common.py b/shared/tensile/Tensile/Common.py
+index a7bbf0724a80..b97fa061327b 100644
+--- a/shared/tensile/Tensile/Common.py
++++ b/shared/tensile/Tensile/Common.py
+@@ -2010,6 +2010,14 @@ def locateExe( defaultPath, exeName ): # /opt/rocm/bin, hip-clang
+ return exePath
+ return None
+
++def PrintDiff(d1, d2):
++ keys = set(d1.keys() | d2.keys())
++ for key in keys:
++ v1 = d1.get(key)
++ v2 = d2.get(key)
++ if v1 != v2:
++ printWarning(f"{key}: {v1} != {v2}")
++
+ def GetAsmCaps(isaVersion: IsaVersion, hipVersion: SemanticVersion, cachedAsmCaps: Dict[IsaVersion, dict]) -> Dict[IsaVersion, dict]:
+ """ Determine assembler capabilities by testing short instructions sequences """
+ if globalParameters["AssemblerPath"] is not None:
+@@ -2132,6 +2140,7 @@ def GetAsmCaps(isaVersion: IsaVersion, hipVersion: SemanticVersion, cachedAsmCap
+ exitFlag = True
+ if exitFlag:
+ printWarning("Cached asm caps differ from derived asm caps for {}".format(isaVersion))
++ PrintDiff(derivedAsmCaps, cachedAsmCaps[isaVersion])
+ return derivedAsmCaps
+ else:
+ printWarning("Assembler not present, asm caps loaded from cache are unverified")
+--
+2.53.0
+
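Reviewer note on 0001: the behaviour of the added PrintDiff helper can be sketched as a standalone function. This is a minimal illustration rather than the Tensile code itself: the name `diff_caps` and the sample tables are hypothetical, it returns strings instead of calling printWarning, and it sorts the keys (the patch iterates an unordered set) so that the output is reproducible.

```python
def diff_caps(derived: dict, cached: dict) -> list[str]:
    """Return one 'key: derived != cached' line per key whose values differ."""
    lines = []
    # Union of both key sets, so keys present in only one table are reported too.
    for key in sorted(derived.keys() | cached.keys()):
        v1 = derived.get(key)  # None when the key is missing
        v2 = cached.get(key)
        if v1 != v2:
            lines.append(f"{key}: {v1} != {v2}")
    return lines


# Hypothetical sample tables, not taken from AsmCaps.py.
derived = {"HasMFMA": False, "MaxVmcnt": 63, "v_mac_f32": True}
cached = {"HasMFMA": False, "MaxVmcnt": 15, "HasWMMA": True}

for line in diff_caps(derived, cached):
    print(line)
# HasWMMA: None != True
# MaxVmcnt: 63 != 15
# v_mac_f32: True != None
```

Because `dict.get` returns None for an absent key, a missing key is reported the same way as a key whose value is None.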
diff --git a/0002-add-generic-gpu-targets.patch b/0002-add-generic-gpu-targets.patch
new file mode 100644
index 0000000..68b8e28
--- /dev/null
+++ b/0002-add-generic-gpu-targets.patch
@@ -0,0 +1,591 @@
+From 60c8c0786b61e1ab2040f7b6d7b6c2b4b244c9e1 Mon Sep 17 00:00:00 2001
+From: Tom Rix <Tom.Rix@amd.com>
+Date: Sun, 8 Mar 2026 01:32:28 +0000
+Subject: [PATCH 2/6] add generic gpu targets
+
+To support generic gpu targets ex/ -DGPU_TARGETS=gfx11-generic.
+
+Tensile does not have support for every possible gpu target. Instead
+of adding them piecemeal, provide support for all the generic targets.
+
+In Common.py overload int tuple for SupportedISA, where if the last
+value is negative, then this is a generic isa.
+Ex
+ (10,3,-1) -> gfx10-3-generic
+ (11,0,-1) -> gfx11-generic
+
+In AsmCaps, cut-n-paste generic tables from a close existing table.
+ex/ (10,3,0) was used for (10,3,-1). Then fix the values based on
+the derived vs cached warnings during a build.
+
+Add new mapping where appropriate.
+
+Signed-off-by: Tom Rix <Tom.Rix@amd.com>
+---
+ shared/tensile/Tensile/AsmCaps.py | 264 ++++++++++++++++++
+ shared/tensile/Tensile/Common.py | 57 +++-
+ .../cmake/TensileSupportedArchitectures.cmake | 9 +-
+ .../Source/lib/include/Tensile/AMDGPU.hpp | 44 ++-
+ .../include/Tensile/PlaceholderLibrary.hpp | 18 ++
+ 5 files changed, 375 insertions(+), 17 deletions(-)
+
+diff --git a/shared/tensile/Tensile/AsmCaps.py b/shared/tensile/Tensile/AsmCaps.py
+index 48eeec1f9a6c..58776e249b78 100644
+--- a/shared/tensile/Tensile/AsmCaps.py
++++ b/shared/tensile/Tensile/AsmCaps.py
+@@ -169,6 +169,50 @@ def getCapabilitiesCache(rocmVersion: NamedTuple) -> dict:
+ 'v_mov_b64': False,
+ 'v_pk_fma_f16': True,
+ 'v_pk_fmac_f16': False},
++ (9, 0, -1): {'HasAddLshl': True,
++ 'HasAtomicAdd': False,
++ 'HasDirectToLdsDest': False,
++ 'HasDirectToLdsNoDest': True,
++ 'HasExplicitCO': True,
++ 'HasExplicitNC': False,
++ 'HasGLCModifier': True,
++ 'HasNTModifier': False,
++ 'HasLshlOr': True,
++ 'HasMFMA': False,
++ 'HasMFMA_b8': False,
++ 'HasMFMA_bf16_1k': False,
++ 'HasMFMA_bf16_original': False,
++ 'HasMFMA_constSrc': False,
++ 'HasMFMA_f64': False,
++ 'HasMFMA_f8': False,
++ 'HasMFMA_i8_908': False,
++ 'HasMFMA_i8_940': False,
++ 'HasMFMA_vgpr': False,
++ 'HasMFMA_xf32': False,
++ 'HasSMulHi': True,
++ 'HasWMMA': False,
++ 'KernargPreloading': False,
++ 'MaxLgkmcnt': 15,
++ 'MaxVmcnt': 63,
++ 'SupportedISA': True,
++ 'SupportedSource': True,
++ 'VOP3v_dot4_i32_i8': False,
++ 'v_dot2_f32_f16': False,
++ 'v_dot2c_f32_f16': False,
++ 'v_dot4_i32_i8': False,
++ 'v_dot4c_i32_i8': False,
++ 'v_fma_f16': True,
++ 'v_fma_f32': True,
++ 'v_fma_f64': True,
++ 'v_fma_mix_f32': False,
++ 'v_fmac_f16': False,
++ 'v_fmac_f32': False,
++ 'v_mac_f16': True,
++ 'v_mac_f32': True,
++ 'v_mad_mix_f32': False,
++ 'v_mov_b64': False,
++ 'v_pk_fma_f16': True,
++ 'v_pk_fmac_f16': False},
+ (9, 0, 6): {'HasAddLshl': True,
+ 'HasAtomicAdd': False,
+ 'HasDirectToLdsDest': False,
+@@ -345,6 +389,50 @@ def getCapabilitiesCache(rocmVersion: NamedTuple) -> dict:
+ 'v_mov_b64': True,
+ 'v_pk_fma_f16': True,
+ 'v_pk_fmac_f16': False},
++ (9, 4, -1): {'HasAddLshl': True,
++ 'HasAtomicAdd': True,
++ 'HasDirectToLdsDest': False,
++ 'HasDirectToLdsNoDest': True,
++ 'HasExplicitCO': True,
++ 'HasExplicitNC': False,
++ 'HasGLCModifier': False,
++ 'HasNTModifier': True,
++ 'HasLshlOr': True,
++ 'HasMFMA': True,
++ 'HasMFMA_b8': False,
++ 'HasMFMA_bf16_1k': True,
++ 'HasMFMA_bf16_original': False,
++ 'HasMFMA_constSrc': True,
++ 'HasMFMA_f64': True,
++ 'HasMFMA_f8': False,
++ 'HasMFMA_i8_908': False,
++ 'HasMFMA_i8_940': True,
++ 'HasMFMA_vgpr': True,
++ 'HasMFMA_xf32': False,
++ 'HasSMulHi': True,
++ 'HasWMMA': False,
++ 'KernargPreloading': True,
++ 'MaxLgkmcnt': 15,
++ 'MaxVmcnt': 63,
++ 'SupportedISA': True,
++ 'SupportedSource': True,
++ 'VOP3v_dot4_i32_i8': True,
++ 'v_dot2_f32_f16': True,
++ 'v_dot2c_f32_f16': True,
++ 'v_dot4_i32_i8': False,
++ 'v_dot4c_i32_i8': True,
++ 'v_fma_f16': True,
++ 'v_fma_f32': True,
++ 'v_fma_f64': True,
++ 'v_fma_mix_f32': True,
++ 'v_fmac_f16': False,
++ 'v_fmac_f32': True,
++ 'v_mac_f16': True,
++ 'v_mac_f32': False,
++ 'v_mad_mix_f32': False,
++ 'v_mov_b64': True,
++ 'v_pk_fma_f16': True,
++ 'v_pk_fmac_f16': False},
+ (9, 5, 0): {'HasAddLshl': True,
+ 'HasAtomicAdd': True,
+ 'HasDirectToLdsDest': False,
+@@ -433,6 +521,50 @@ def getCapabilitiesCache(rocmVersion: NamedTuple) -> dict:
+ 'v_mov_b64': False,
+ 'v_pk_fma_f16': True,
+ 'v_pk_fmac_f16': False},
++ (10, 1, -1): {'HasAddLshl': True,
++ 'HasAtomicAdd': False,
++ 'HasDirectToLdsDest': False,
++ 'HasDirectToLdsNoDest': True,
++ 'HasExplicitCO': True,
++ 'HasExplicitNC': True,
++ 'HasGLCModifier': True,
++ 'HasNTModifier': False,
++ 'HasLshlOr': True,
++ 'HasMFMA': False,
++ 'HasMFMA_b8': False,
++ 'HasMFMA_bf16_1k': False,
++ 'HasMFMA_bf16_original': False,
++ 'HasMFMA_constSrc': False,
++ 'HasMFMA_f64': False,
++ 'HasMFMA_f8': False,
++ 'HasMFMA_i8_908': False,
++ 'HasMFMA_i8_940': False,
++ 'HasMFMA_vgpr': False,
++ 'HasMFMA_xf32': False,
++ 'HasSMulHi': True,
++ 'HasWMMA': False,
++ 'KernargPreloading': False,
++ 'MaxLgkmcnt': 15,
++ 'MaxVmcnt': 63,
++ 'SupportedISA': True,
++ 'SupportedSource': True,
++ 'VOP3v_dot4_i32_i8': False,
++ 'v_dot2_f32_f16': False,
++ 'v_dot2c_f32_f16': False,
++ 'v_dot4_i32_i8': False,
++ 'v_dot4c_i32_i8': False,
++ 'v_fma_f16': True,
++ 'v_fma_f32': True,
++ 'v_fma_f64': True,
++ 'v_fma_mix_f32': True,
++ 'v_fmac_f16': False,
++ 'v_fmac_f32': True,
++ 'v_mac_f16': False,
++ 'v_mac_f32': True,
++ 'v_mad_mix_f32': False,
++ 'v_mov_b64': False,
++ 'v_pk_fma_f16': True,
++ 'v_pk_fmac_f16': False},
+ (10, 1, 1): {'HasAddLshl': True,
+ 'HasAtomicAdd': False,
+ 'HasDirectToLdsDest': False,
+@@ -565,6 +697,50 @@ def getCapabilitiesCache(rocmVersion: NamedTuple) -> dict:
+ 'v_mov_b64': False,
+ 'v_pk_fma_f16': True,
+ 'v_pk_fmac_f16': False},
++ (10, 3, -1): {'HasAddLshl': True,
++ 'HasAtomicAdd': False,
++ 'HasDirectToLdsDest': False,
++ 'HasDirectToLdsNoDest': True,
++ 'HasExplicitCO': True,
++ 'HasExplicitNC': True,
++ 'HasGLCModifier': True,
++ 'HasNTModifier': False,
++ 'HasLshlOr': True,
++ 'HasMFMA': False,
++ 'HasMFMA_b8': False,
++ 'HasMFMA_bf16_1k': False,
++ 'HasMFMA_bf16_original': False,
++ 'HasMFMA_constSrc': False,
++ 'HasMFMA_f64': False,
++ 'HasMFMA_f8': False,
++ 'HasMFMA_i8_908': False,
++ 'HasMFMA_i8_940': False,
++ 'HasMFMA_vgpr': False,
++ 'HasMFMA_xf32': False,
++ 'HasSMulHi': True,
++ 'HasWMMA': False,
++ 'KernargPreloading': False,
++ 'MaxLgkmcnt': 15,
++ 'MaxVmcnt': 63,
++ 'SupportedISA': True,
++ 'SupportedSource': True,
++ 'VOP3v_dot4_i32_i8': True,
++ 'v_dot2_f32_f16': True,
++ 'v_dot2c_f32_f16': True,
++ 'v_dot4_i32_i8': False,
++ 'v_dot4c_i32_i8': True,
++ 'v_fma_f16': True,
++ 'v_fma_f32': True,
++ 'v_fma_f64': True,
++ 'v_fma_mix_f32': True,
++ 'v_fmac_f16': False,
++ 'v_fmac_f32': True,
++ 'v_mac_f16': False,
++ 'v_mac_f32': False,
++ 'v_mad_mix_f32': False,
++ 'v_mov_b64': False,
++ 'v_pk_fma_f16': True,
++ 'v_pk_fmac_f16': False},
+ (10, 3, 1): {'HasAddLshl': True,
+ 'HasAtomicAdd': False,
+ 'HasDirectToLdsDest': False,
+@@ -873,6 +1049,50 @@ def getCapabilitiesCache(rocmVersion: NamedTuple) -> dict:
+ 'v_mov_b64': False,
+ 'v_pk_fma_f16': True,
+ 'v_pk_fmac_f16': False},
++ (11, 0, -1): {'HasAddLshl': True,
++ 'HasAtomicAdd': True,
++ 'HasDirectToLdsDest': False,
++ 'HasDirectToLdsNoDest': False,
++ 'HasExplicitCO': True,
++ 'HasExplicitNC': True,
++ 'HasGLCModifier': True,
++ 'HasNTModifier': False,
++ 'HasLshlOr': True,
++ 'HasMFMA': False,
++ 'HasMFMA_b8': False,
++ 'HasMFMA_bf16_1k': False,
++ 'HasMFMA_bf16_original': False,
++ 'HasMFMA_constSrc': False,
++ 'HasMFMA_f64': False,
++ 'HasMFMA_f8': False,
++ 'HasMFMA_i8_908': False,
++ 'HasMFMA_i8_940': False,
++ 'HasMFMA_vgpr': False,
++ 'HasMFMA_xf32': False,
++ 'HasSMulHi': True,
++ 'HasWMMA': True,
++ 'KernargPreloading': False,
++ 'MaxLgkmcnt': 15,
++ 'MaxVmcnt': 63,
++ 'SupportedISA': True,
++ 'SupportedSource': True,
++ 'VOP3v_dot4_i32_i8': True,
++ 'v_dot2_f32_f16': True,
++ 'v_dot2c_f32_f16': True,
++ 'v_dot4_i32_i8': False,
++ 'v_dot4c_i32_i8': False,
++ 'v_fma_f16': True,
++ 'v_fma_f32': True,
++ 'v_fma_f64': True,
++ 'v_fma_mix_f32': True,
++ 'v_fmac_f16': False,
++ 'v_fmac_f32': True,
++ 'v_mac_f16': False,
++ 'v_mac_f32': False,
++ 'v_mad_mix_f32': False,
++ 'v_mov_b64': False,
++ 'v_pk_fma_f16': True,
++ 'v_pk_fmac_f16': False},
+ (11, 0, 1): {'HasAddLshl': True,
+ 'HasAtomicAdd': True,
+ 'HasDirectToLdsDest': False,
+@@ -1225,6 +1445,50 @@ def getCapabilitiesCache(rocmVersion: NamedTuple) -> dict:
+ 'v_mov_b64': False,
+ 'v_pk_fma_f16': True,
+ 'v_pk_fmac_f16': False},
++ (12, 0, -1): {'HasAddLshl': True,
++ 'HasAtomicAdd': False,
++ 'HasDirectToLdsDest': False,
++ 'HasDirectToLdsNoDest': False,
++ 'HasExplicitCO': True,
++ 'HasExplicitNC': True,
++ 'HasGLCModifier': False,
++ 'HasNTModifier': False,
++ 'HasLshlOr': True,
++ 'HasMFMA': False,
++ 'HasMFMA_b8': False,
++ 'HasMFMA_bf16_1k': False,
++ 'HasMFMA_bf16_original': False,
++ 'HasMFMA_constSrc': False,
++ 'HasMFMA_f64': False,
++ 'HasMFMA_f8': False,
++ 'HasMFMA_i8_908': False,
++ 'HasMFMA_i8_940': False,
++ 'HasMFMA_vgpr': False,
++ 'HasMFMA_xf32': False,
++ 'HasSMulHi': True,
++ 'HasWMMA': False,
++ 'KernargPreloading': False,
++ 'MaxLgkmcnt': 15,
++ 'MaxVmcnt': 63,
++ 'SupportedISA': True,
++ 'SupportedSource': True,
++ 'VOP3v_dot4_i32_i8': True,
++ 'v_dot2_f32_f16': True,
++ 'v_dot2c_f32_f16': False,
++ 'v_dot4_i32_i8': False,
++ 'v_dot4c_i32_i8': False,
++ 'v_fma_f16': True,
++ 'v_fma_f32': True,
++ 'v_fma_f64': True,
++ 'v_fma_mix_f32': True,
++ 'v_fmac_f16': False,
++ 'v_fmac_f32': True,
++ 'v_mac_f16': False,
++ 'v_mac_f32': False,
++ 'v_mad_mix_f32': False,
++ 'v_mov_b64': False,
++ 'v_pk_fma_f16': True,
++ 'v_pk_fmac_f16': False},
+ (12, 0, 1): {'HasAddLshl': True,
+ 'HasAtomicAdd': False,
+ 'HasDirectToLdsDest': False,
+diff --git a/shared/tensile/Tensile/Common.py b/shared/tensile/Tensile/Common.py
+index b97fa061327b..9a2c399fad1b 100644
+--- a/shared/tensile/Tensile/Common.py
++++ b/shared/tensile/Tensile/Common.py
+@@ -246,12 +246,12 @@ globalParameters["NumMergedFiles"] = 1 # The number of files that ker
+
+ globalParameters["MaxFileName"] = 64 # If a file name would be longer than this, shorten it with a hash.
+ globalParameters["SupportedISA"] = [(8,0,3),
+- (9,0,0), (9,0,6), (9,0,8), (9,0,10),
+- (9,4,2), (9,5,0),
+- (10,1,0), (10,1,1), (10,1,2), (10,3,0), (10,3,1), (10,3,2), (10,3,3), (10,3,4), (10,3,5), (10,3,6),
+- (11,0,0), (11,0,1), (11,0,2), (11,0,3),
++ (9,0,0), (9,0,6), (9,0,8), (9,0,10), (9,0,-1),
++ (9,4,2), (9,4,-1), (9,5,0),
++ (10,1,0), (10,1,1), (10,1,2), (10,1,-1), (10,3,0), (10,3,1), (10,3,2), (10,3,3), (10,3,4), (10,3,5), (10,3,6), (10,3,-1),
++ (11,0,0), (11,0,1), (11,0,2), (11,0,3), (11,0,-1),
+ (11,5,0), (11,5,1), (11,5,2), (11,5,3),
+- (12,0,0), (12,0,1)] # assembly kernels writer supports these architectures
++ (12,0,0), (12,0,1), (12,0,-1)] # assembly kernels writer supports these architectures
+
+ globalParameters["KeepBuildTmp"] = True # Do not remove build artifacts during the build process or build_tmp after build completes
+ globalParameters["GenerateManifestAndExit"] = False # Output manifest file with list of expected library objects and exit
+@@ -320,15 +320,15 @@ architectureMap = {
+ 'gfx803':'r9nano', 'gfx900':'vega10', 'gfx900:xnack-':'vega10',
+ 'gfx906':'vega20', 'gfx906:xnack+':'vega20', 'gfx906:xnack-':'vega20',
+ 'gfx908':'arcturus','gfx908:xnack+':'arcturus', 'gfx908:xnack-':'arcturus',
+- 'gfx90a':'aldebaran', 'gfx90a:xnack+':'aldebaran', 'gfx90a:xnack-':'aldebaran',
+- 'gfx942':'aquavanjaram942', 'gfx942:xnack+':'aquavanjaram942', 'gfx942:xnack-':'aquavanjaram942',
++ 'gfx90a':'aldebaran', 'gfx90a:xnack+':'aldebaran', 'gfx90a:xnack-':'aldebaran', 'gfx9-generic':'gfx9-generic',
++ 'gfx942':'aquavanjaram942', 'gfx942:xnack+':'aquavanjaram942', 'gfx942:xnack-':'aquavanjaram942', 'gfx9-4-generic':'gfx9-4-generic',
+ 'gfx950':'gfx950', 'gfx950:xnack+':'gfx950', 'gfx950:xnack-':'gfx950',
+- 'gfx1010':'navi10', 'gfx1011':'navi12', 'gfx1012':'navi14',
+- 'gfx1030':'navi21', 'gfx1031':'navi22', 'gfx1032':'navi23', 'gfx1033':'van gogh', 'gfx1034':'navi24', 'gfx1035':'rembrandt', 'gfx1036':'raphael',
+- 'gfx1100':'navi31', 'gfx1101':'navi32', 'gfx1102':'navi33', 'gfx1103':'gfx1103',
++ 'gfx1010':'navi10', 'gfx1011':'navi12', 'gfx1012':'navi14', 'gfx10-1-generic':'gfx10-1-generic',
++ 'gfx1030':'navi21', 'gfx1031':'navi22', 'gfx1032':'navi23', 'gfx1033':'van gogh', 'gfx1034':'navi24', 'gfx1035':'rembrandt', 'gfx1036':'raphael', 'gfx10-3-generic':'gfx10-3-generic',
++ 'gfx1100':'navi31', 'gfx1101':'navi32', 'gfx1102':'navi33', 'gfx1103':'gfx1103', 'gfx11-generic':'gfx11-generic',
+ 'gfx1150':'strixpoint', 'gfx1151':'strixhalo', 'gfx1152':'gfx1152', 'gfx1153':'gfx1153',
+ 'gfx1200':'gfx1200',
+- 'gfx1201':'gfx1201'
++ 'gfx1201':'gfx1201', 'gfx12-generic':'gfx12-generic',
+ }
+
+ def getArchitectureName(gfxName: str) -> Optional[str]:
+@@ -2201,6 +2201,21 @@ def tryAssembler(isaVersion, asmString, debug=False, *options):
+
+ def gfxArch(name: str) -> Optional[IsaVersion]:
+ import re
++
++ # Handle special case for generic architectures like 'gfx10-3-generic'
++ generic_match = re.search(r'gfx([0-9]+)-([0-9]+)-generic', name)
++ if generic_match:
++ major = int(generic_match.group(1))
++ minor = int(generic_match.group(2))
++ return (major, minor, -1) # step=-1 to indicate generic
++
++ # Handle special case for generic architectures like 'gfx11-generic'
++ generic_match = re.search(r'gfx([0-9]+)-generic', name)
++ if generic_match:
++ major = int(generic_match.group(1))
++ return (major, 0, -1) # step=-1 to indicate generic, minor=0
++
++ # Handle regular architectures like 'gfx900', 'gfx803' etc.
+ match = re.search(r'gfx([0-9a-fA-F]{3,})', name)
+ if not match: return None
+
+@@ -2219,11 +2234,23 @@ def gfxArch(name: str) -> Optional[IsaVersion]:
+ return rv
+
+ def gfxName(arch):
+- # convert last digit to hex because reasons
+- name = str(arch[0]) + str(arch[1]) + ('%x' % arch[2])
++ # If arch[2] is negative, this is a generic target
++ if arch[2] < 0:
++ if arch[0] == 9:
++ if arch[1] == 4:
++ name = str(arch[0]) + '-' + str(arch[1]) + '-generic'
++ else:
++ name = str(arch[0]) + '-generic'
++ elif arch[0] == 10:
++ name = str(arch[0]) + '-' + str(arch[1]) + '-generic'
++ else:
++ name = str(arch[0]) + '-generic'
++ else:
++ # The normal case
++ # convert last digit to hex because reasons
++ name = str(arch[0]) + str(arch[1]) + ('%x' % arch[2])
+ return 'gfx' + ''.join(map(str,name))
+
+-
+ def detectIsaWindows(output):
+ i = 0
+ for line in output:
+@@ -2475,7 +2502,7 @@ def assignGlobalParameters( config, capabilitiesCache: Optional[dict] = None ):
+ if os.name == "nt":
+ globalParameters["CurrentISA"] = (9,0,6)
+ printWarning("Failed to detect ISA so forcing (gfx906) on windows")
+- isasWithDisabledHWMonitor = ((9,4,2), (9,5,0), (11,0,0), (11,0,1), (11,0,2), (11,0,3), (11,5,0), (11,5,1), (11,5,2), (11,5,3), (12,0,0), (12,0,1))
++ isasWithDisabledHWMonitor = ((9,0,-1), (9,4,2), (9,4,-1), (9,5,0), (10,1,-1), (10,3,-1), (11,0,0), (11,0,1), (11,0,2), (11,0,3), (11,5,0), (11,5,1), (11,5,2), (11,5,3), (11,0,-1), (12,0,0), (12,0,1), (12,0,-1))
+ if globalParameters["CurrentISA"] in isasWithDisabledHWMonitor:
+ isaString = ', '.join(map(gfxName, isasWithDisabledHWMonitor))
+ printWarning(f"HardwareMonitor currently disabled for {isaString}")
+diff --git a/shared/tensile/Tensile/Source/cmake/TensileSupportedArchitectures.cmake b/shared/tensile/Tensile/Source/cmake/TensileSupportedArchitectures.cmake
+index a1fb7166cf63..5f3e2d54a003 100644
+--- a/shared/tensile/Tensile/Source/cmake/TensileSupportedArchitectures.cmake
++++ b/shared/tensile/Tensile/Source/cmake/TensileSupportedArchitectures.cmake
+@@ -35,11 +35,14 @@ if(NOT BUILD_ADDRESS_SANITIZER)
+ "gfx906"
+ "gfx908"
+ "gfx90a"
++ "gfx9-generic"
+ "gfx942"
++ "gfx9-4-generic"
+ "gfx950"
+ "gfx1010"
+ "gfx1011"
+ "gfx1012"
++ "gfx10-1-generic"
+ "gfx1030"
+ "gfx1031"
+ "gfx1032"
+@@ -47,6 +50,7 @@ if(NOT BUILD_ADDRESS_SANITIZER)
+ "gfx1034"
+ "gfx1035"
+ "gfx1036"
++ "gfx10-3-generic"
+ "gfx1100"
+ "gfx1101"
+ "gfx1102"
+@@ -55,8 +59,11 @@ if(NOT BUILD_ADDRESS_SANITIZER)
+ "gfx1151"
+ "gfx1152"
+ "gfx1153"
++ "gfx11-generic"
+ "gfx1200"
+- "gfx1201")
++ "gfx1201"
++ "gfx12-generic"
++ )
+
+ set(SUPPORTED_ARCHITECTURES ${BASE_ARCHITECTURES})
+ list(APPEND SUPPORTED_ARCHITECTURES
+diff --git a/shared/tensile/Tensile/Source/lib/include/Tensile/AMDGPU.hpp b/shared/tensile/Tensile/Source/lib/include/Tensile/AMDGPU.hpp
+index 1d22bfe712da..be9d5a78c077 100644
+--- a/shared/tensile/Tensile/Source/lib/include/Tensile/AMDGPU.hpp
++++ b/shared/tensile/Tensile/Source/lib/include/Tensile/AMDGPU.hpp
+@@ -81,7 +81,13 @@ namespace Tensile
+ gfx1152 = 1152,
+ gfx1153 = 1153,
+ gfx1200 = 1200,
+- gfx1201 = 1201
++ gfx1201 = 1201,
++ gfx9_generic = -900,
++ gfx9_4_generic = -940,
++ gfx10_1_generic = -1010,
++ gfx10_3_generic = -1030,
++ gfx11_generic = -1100,
++ gfx12_generic = -1200,
+ };
+
+ static std::string toString(Processor p)
+@@ -142,6 +148,18 @@ namespace Tensile
+ return "gfx1200";
+ case AMDGPU::Processor::gfx1201:
+ return "gfx1201";
++ case AMDGPU::Processor::gfx9_generic:
++ return "gfx9-generic";
++ case AMDGPU::Processor::gfx9_4_generic:
++ return "gfx9-4-generic";
++ case AMDGPU::Processor::gfx10_1_generic:
++ return "gfx10-1-generic";
++ case AMDGPU::Processor::gfx10_3_generic:
++ return "gfx10-3-generic";
++ case AMDGPU::Processor::gfx11_generic:
++ return "gfx11-generic";
++ case AMDGPU::Processor::gfx12_generic:
++ return "gfx12-generic";
+ }
+ return "";
+ }
+@@ -256,6 +274,30 @@ namespace Tensile
+ {
+ return AMDGPU::Processor::gfx1201;
+ }
++ else if(deviceString.find("gfx9-generic") != std::string::npos)
++ {
++ return AMDGPU::Processor::gfx9_generic;
++ }
++ else if(deviceString.find("gfx9-4-generic") != std::string::npos)
++ {
++ return AMDGPU::Processor::gfx9_4_generic;
++ }
++ else if(deviceString.find("gfx10-1-generic") != std::string::npos)
++ {
++ return AMDGPU::Processor::gfx10_1_generic;
++ }
++ else if(deviceString.find("gfx10-3-generic") != std::string::npos)
++ {
++ return AMDGPU::Processor::gfx10_3_generic;
++ }
++ else if(deviceString.find("gfx11-generic") != std::string::npos)
++ {
++ return AMDGPU::Processor::gfx11_generic;
++ }
++ else if(deviceString.find("gfx12-generic") != std::string::npos)
++ {
++ return AMDGPU::Processor::gfx12_generic;
++ }
+ else
+ {
+ return static_cast<AMDGPU::Processor>(0);
+diff --git a/shared/tensile/Tensile/Source/lib/include/Tensile/PlaceholderLibrary.hpp b/shared/tensile/Tensile/Source/lib/include/Tensile/PlaceholderLibrary.hpp
+index a9da044e8f39..2f8b18779936 100644
+--- a/shared/tensile/Tensile/Source/lib/include/Tensile/PlaceholderLibrary.hpp
++++ b/shared/tensile/Tensile/Source/lib/include/Tensile/PlaceholderLibrary.hpp
+@@ -66,6 +66,12 @@ namespace Tensile
+ gfx1153,
+ gfx1200,
+ gfx1201,
++ gfx9_generic,
++ gfx9_4_generic,
++ gfx10_1_generic,
++ gfx10_3_generic,
++ gfx11_generic,
++ gfx12_generic,
+ All
+ };
+
+@@ -130,6 +136,18 @@ namespace Tensile
+ return "TensileLibrary_*_gfx1200";
+ case LazyLoadingInit::gfx1201:
+ return "TensileLibrary_*_gfx1201";
++ case LazyLoadingInit::gfx9_generic:
++ return "TensileLibrary_*_gfx9-generic";
++ case LazyLoadingInit::gfx9_4_generic:
++ return "TensileLibrary_*_gfx9-4-generic";
++ case LazyLoadingInit::gfx10_1_generic:
++ return "TensileLibrary_*_gfx10-1-generic";
++ case LazyLoadingInit::gfx10_3_generic:
++ return "TensileLibrary_*_gfx10-3-generic";
++ case LazyLoadingInit::gfx11_generic:
++ return "TensileLibrary_*_gfx11-generic";
++ case LazyLoadingInit::gfx12_generic:
++ return "TensileLibrary_*_gfx12-generic";
+ case LazyLoadingInit::None:
+ return "";
+ }
+--
+2.53.0
+
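Reviewer note on 0002: the tuple encoding for generic targets (a negative last element) and the name mapping in gfxArch/gfxName can be checked with a small round-trip sketch. The functions `generic_arch` and `generic_name` below are hypothetical stand-ins that reproduce only the generic branches shown in the patch; the regular gfxNNN handling is left out.

```python
import re
from typing import Optional, Tuple

Isa = Tuple[int, int, int]


def generic_arch(name: str) -> Optional[Isa]:
    """Parse 'gfx10-3-generic' or 'gfx11-generic' into (major, minor, -1)."""
    m = re.search(r"gfx([0-9]+)-([0-9]+)-generic", name)
    if m:
        return (int(m.group(1)), int(m.group(2)), -1)
    m = re.search(r"gfx([0-9]+)-generic", name)
    if m:
        return (int(m.group(1)), 0, -1)  # minor defaults to 0
    return None


def generic_name(arch: Isa) -> str:
    """Inverse of generic_arch for tuples whose last element is negative."""
    major, minor, step = arch
    if step >= 0:
        raise ValueError("not a generic ISA tuple")
    # Only gfx9-4 and the gfx10 families carry the minor number in the name.
    if major == 10 or (major == 9 and minor == 4):
        return f"gfx{major}-{minor}-generic"
    return f"gfx{major}-generic"


NAMES = ["gfx9-generic", "gfx9-4-generic", "gfx10-1-generic",
         "gfx10-3-generic", "gfx11-generic", "gfx12-generic"]

for name in NAMES:
    print(name, "->", generic_arch(name), "->", generic_name(generic_arch(name)))
```

Each of the six names added by the patch maps to a tuple that is present in the extended SupportedISA list and maps back to the same name.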
diff --git a/0003-improve-fallback-name-to-handle-generics.patch b/0003-improve-fallback-name-to-handle-generics.patch
new file mode 100644
index 0000000..68859a0
--- /dev/null
+++ b/0003-improve-fallback-name-to-handle-generics.patch
@@ -0,0 +1,32 @@
+From 6f042a916612aca518254d5870590d15ec7a16e6 Mon Sep 17 00:00:00 2001
+From: Tom Rix <Tom.Rix@amd.com>
+Date: Sun, 8 Mar 2026 13:38:28 -0700
+Subject: [PATCH 3/6] improve fallback name to handle generics
+
+The archName can be of the form gfx90a-xnack{+,-} and this function
+determines the fallback is gfx90a. However when the archName is
+a generic, ex gfx11-generic, the entire name must be used. So
+check if the name ends with -generic and skip splitting.
+
+Signed-off-by: Tom Rix <Tom.Rix@amd.com>
+---
+ shared/tensile/Tensile/TensileCreateLibrary.py | 3 ++-
+ 1 file changed, 2 insertions(+), 1 deletion(-)
+
+diff --git a/shared/tensile/Tensile/TensileCreateLibrary.py b/shared/tensile/Tensile/TensileCreateLibrary.py
+index 543b0379c41e..eb7147a4fd8a 100644
+--- a/shared/tensile/Tensile/TensileCreateLibrary.py
++++ b/shared/tensile/Tensile/TensileCreateLibrary.py
+@@ -962,7 +962,8 @@ def addFallback(masterLibraries: Dict[str, MasterSolutionLibrary]) -> None:
+ value.insert(masterLibraries["fallback"])
+
+ for archName in archs:
+- archName = archName.split("-", 1)[0]
++ if not archName.endswith("-generic"):
++ archName = archName.split("-", 1)[0]
+ if archName not in masterLibraries:
+ tPrint(1, "Using fallback for arch: " + archName)
+ masterLibraries[archName] = masterLibraries["fallback"]
+--
+2.53.0
+
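Reviewer note on 0003: a minimal sketch of the name normalization before and after this change. The function names are hypothetical; only the split and the endswith check come from the patch.

```python
def fallback_key_old(arch_name: str) -> str:
    """Pre-patch behaviour: always cut at the first '-'."""
    return arch_name.split("-", 1)[0]


def fallback_key_new(arch_name: str) -> str:
    """Post-patch behaviour: keep generic names whole."""
    if arch_name.endswith("-generic"):
        return arch_name
    return arch_name.split("-", 1)[0]


for name in ["gfx90a-xnack+", "gfx11-generic", "gfx10-3-generic"]:
    print(name, "old:", fallback_key_old(name), "new:", fallback_key_new(name))
# gfx90a-xnack+ old: gfx90a new: gfx90a
# gfx11-generic old: gfx11 new: gfx11-generic
# gfx10-3-generic old: gfx10 new: gfx10-3-generic
```

Without the check, every generic name would collapse to a prefix such as gfx11 that is not a key anyone builds a library for.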
diff --git a/0004-generic-arches-need-a-solution-index.patch b/0004-generic-arches-need-a-solution-index.patch
new file mode 100644
index 0000000..be1b231
--- /dev/null
+++ b/0004-generic-arches-need-a-solution-index.patch
@@ -0,0 +1,45 @@
+From 71f280ea73630c0453fda896a36d0b3092b95aed Mon Sep 17 00:00:00 2001
+From: Tom Rix <Tom.Rix@amd.com>
+Date: Sun, 8 Mar 2026 16:21:07 -0700
+Subject: [PATCH 4/6] generic arches need a solution index
+
+So there is no overlap with the regular gpu indices, pick
+a shift value that does not overlap.
+
+(9 << 29) >> 18 = 18432
+
+Signed-off-by: Tom Rix <Tom.Rix@amd.com>
+---
+ shared/tensile/Tensile/SolutionLibrary.py | 9 +++++++--
+ 1 file changed, 7 insertions(+), 2 deletions(-)
+
+diff --git a/shared/tensile/Tensile/SolutionLibrary.py b/shared/tensile/Tensile/SolutionLibrary.py
+index 0c7b6428d624..e7c4b7457737 100644
+--- a/shared/tensile/Tensile/SolutionLibrary.py
++++ b/shared/tensile/Tensile/SolutionLibrary.py
+@@ -255,7 +255,7 @@ class MasterSolutionLibrary:
+ """Maps hex characters from gfx name to an index.
+
+ Given a gfx name of the form gfx[0-9a-f]*, map the characters following
+- gfx from hex to int and left shift the integer by 18.
++ gfx from hex to int and left shift the integer by 18 (or 29 for generic architectures).
+
+ Args:
+ architectureName: The gfx name (or fallback).
+@@ -273,7 +273,12 @@ class MasterSolutionLibrary:
+ archString = re.search('(?<=gfx)[0-9a-f]*', architectureName)
+ if archString is not None:
+ archLiteral = archString.group(0)
+- archval = (int(archLiteral, 16) << 18)
++ # Use left shift of 29 for generic architectures, 18 otherwise
++ if architectureName.endswith("-generic"):
++ shift_bits = 29
++ else:
++ shift_bits = 18
++ archval = (int(archLiteral, 16) << shift_bits)
+ # Check for duplicate architecture values
+ if archval >= 0 and not archval in cls.ArchitectureSet:
+ cls.ArchitectureSet.add(archval)
+--
+2.53.0
+
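Reviewer note on 0004: the index arithmetic can be reproduced outside the class. The sketch below assumes only what the hunk shows (the regex, the hex conversion, and the two shift values); `arch_index` is a hypothetical free function, and the duplicate check against ArchitectureSet is not modelled.

```python
import re
from typing import Optional

REGULAR_SHIFT = 18
GENERIC_SHIFT = 29


def arch_index(architecture_name: str) -> Optional[int]:
    """Hex digits after 'gfx', shifted by 29 for generic names and 18 otherwise."""
    match = re.search(r"(?<=gfx)[0-9a-f]*", architecture_name)
    if match is None:
        return None  # e.g. "fallback"
    generic = architecture_name.endswith("-generic")
    shift = GENERIC_SHIFT if generic else REGULAR_SHIFT
    return int(match.group(0), 16) << shift


# The arithmetic quoted in the commit message.
print((9 << GENERIC_SHIFT) >> REGULAR_SHIFT)  # 18432

# The smallest generic index sits above the largest regular one in the patch set.
print(arch_index("gfx1201") < arch_index("gfx9-generic"))  # True

# The regex stops at the first '-', so only the leading digits are used.
print(arch_index("gfx10-1-generic") == arch_index("gfx10-3-generic"))  # True
```

18432 is 0x4800, so a generic index starts where a regular name of gfx4800 would. In this sketch gfx10-1-generic and gfx10-3-generic (and likewise gfx9-generic and gfx9-4-generic) produce the same value, because the match ends at the first hyphen. How the real class treats such a duplicate depends on code outside the quoted hunk.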
diff --git a/0005-rocblas-add-rocblas_internal_get_generic_arch_name.patch b/0005-rocblas-add-rocblas_internal_get_generic_arch_name.patch
new file mode 100644
index 0000000..19d3b70
--- /dev/null
+++ b/0005-rocblas-add-rocblas_internal_get_generic_arch_name.patch
@@ -0,0 +1,89 @@
+From 8926fb0fca00d1ff859682b3df91243cff650425 Mon Sep 17 00:00:00 2001
+From: Tom Rix <Tom.Rix@amd.com>
+Date: Mon, 9 Mar 2026 18:15:43 -0700
+Subject: [PATCH 5/6] [rocblas] add rocblas_internal_get_generic_arch_name
+
+A function similar to rocblas_internal_get_arch_name,
+returns the generic name for the arch.
+
+Signed-off-by: Tom Rix <Tom.Rix@amd.com>
+---
+ .../rocblas/library/src/include/utility.hpp | 3 ++
+ .../rocblas/library/src/rocblas_auxiliary.cpp | 47 +++++++++++++++++++
+ 2 files changed, 50 insertions(+)
+
+diff --git a/projects/rocblas/library/src/include/utility.hpp b/projects/rocblas/library/src/include/utility.hpp
+index bb4212f78630..966958c9aca4 100644
+--- a/projects/rocblas/library/src/include/utility.hpp
++++ b/projects/rocblas/library/src/include/utility.hpp
+@@ -800,6 +800,9 @@ bool rocblas_internal_tensile_supports_ldc_ne_ldd(rocblas_handle handle);
+ // We assume true if the value is between 942 to 1000
+ ROCBLAS_INTERNAL_EXPORT bool rocblas_internal_tensile_supports_xdl_math_op(rocblas_math_mode mode);
+
++// for internal use
++ROCBLAS_INTERNAL_EXPORT std::string rocblas_internal_get_generic_arch_name();
++
+ // for internal use during testing, fetch arch name
+ ROCBLAS_INTERNAL_EXPORT std::string rocblas_internal_get_arch_name();
+
+diff --git a/projects/rocblas/library/src/rocblas_auxiliary.cpp b/projects/rocblas/library/src/rocblas_auxiliary.cpp
+index 57c24a9f519d..3f7c375eefc4 100644
+--- a/projects/rocblas/library/src/rocblas_auxiliary.cpp
++++ b/projects/rocblas/library/src/rocblas_auxiliary.cpp
+@@ -917,6 +917,53 @@ bool rocblas_internal_tensile_supports_xdl_math_op(rocblas_math_mode mode)
+ return (deviceString.find("gfx942") != std::string::npos);
+ }
+
++std::string rocblas_internal_get_generic_arch_name()
++{
++ std::string arch_name = rocblas_internal_get_arch_name();
++ // Map specific architecture names to generic names
++ static const std::map<std::string, std::string> arch_map = {
++ {"gfx900", "gfx9-generic"},
++ {"gfx902", "gfx9-generic"},
++ {"gfx904", "gfx9-generic"},
++ {"gfx906", "gfx9-generic"},
++ {"gfx908", "gfx9-generic"},
++ {"gfx909", "gfx9-generic"},
++ {"gfx90a", "gfx9-generic"},
++ {"gfx940", "gfx9-4-generic"},
++ {"gfx941", "gfx9-4-generic"},
++ {"gfx942", "gfx9-4-generic"},
++ {"gfx1010", "gfx10-1-generic"},
++ {"gfx1011", "gfx10-1-generic"},
++ {"gfx1012", "gfx10-1-generic"},
++ {"gfx1013", "gfx10-1-generic"},
++ {"gfx1030", "gfx10-3-generic"},
++ {"gfx1031", "gfx10-3-generic"},
++ {"gfx1032", "gfx10-3-generic"},
++ {"gfx1033", "gfx10-3-generic"},
++ {"gfx1034", "gfx10-3-generic"},
++ {"gfx1035", "gfx10-3-generic"},
++ {"gfx1036", "gfx10-3-generic"},
++ {"gfx1100", "gfx11-generic"},
++ {"gfx1101", "gfx11-generic"},
++ {"gfx1102", "gfx11-generic"},
++ {"gfx1103", "gfx11-generic"},
++ {"gfx1150", "gfx11-generic"},
++ {"gfx1151", "gfx11-generic"},
++ {"gfx1152", "gfx11-generic"},
++ {"gfx1153", "gfx11-generic"},
++ {"gfx1200", "gfx12-generic"},
++ {"gfx1201", "gfx12-generic"},
++ {"gfx1250", "gfx12-generic"},
++ {"gfx1251", "gfx12-generic"}
++ };
++
++ auto it = arch_map.find(arch_name);
++ if(it != arch_map.end())
++ return it->second;
++
++ // Return original name if no mapping found
++ return arch_name;
++}
+ // exported. Get architecture name
+ std::string rocblas_internal_get_arch_name()
+ {
+--
+2.53.0
+
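Reviewer note on 0005: the lookup is a plain table with a pass-through default. The sketch below is a Python paraphrase of the C++ function, not the rocBLAS API. `GENERIC_ARCH` holds only a subset of the entries from the patch, and `search_order` is a hypothetical helper that mirrors the specific-then-generic order used by patch 6/6.

```python
# Subset of the arch_map entries in the patch, for illustration only.
GENERIC_ARCH = {
    "gfx906": "gfx9-generic",
    "gfx942": "gfx9-4-generic",
    "gfx1012": "gfx10-1-generic",
    "gfx1030": "gfx10-3-generic",
    "gfx1151": "gfx11-generic",
    "gfx1201": "gfx12-generic",
}


def generic_arch_name(arch_name: str) -> str:
    """Return the generic name, or the original name when no mapping exists."""
    return GENERIC_ARCH.get(arch_name, arch_name)


def search_order(arch_name: str) -> list[str]:
    """Specific name first, generic name second."""
    return [arch_name, generic_arch_name(arch_name)]


print(search_order("gfx1151"))  # ['gfx1151', 'gfx11-generic']
print(search_order("gfx950"))   # ['gfx950', 'gfx950']
```

gfx950 has no entry in the patch's table, so both candidates are the same string and the second lookup repeats the first.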
diff --git a/0006-rocblas-generalize-finding-tensile-for-generics.patch b/0006-rocblas-generalize-finding-tensile-for-generics.patch
new file mode 100644
index 0000000..d8be7e8
--- /dev/null
+++ b/0006-rocblas-generalize-finding-tensile-for-generics.patch
@@ -0,0 +1,136 @@
+From 4bf4de5e52725e5d253eef646d770004ef9db772 Mon Sep 17 00:00:00 2001
+From: Tom Rix <Tom.Rix@amd.com>
+Date: Tue, 10 Mar 2026 07:06:47 -0700
+Subject: [PATCH 6/6] [rocblas] generalize finding tensile for generics
+
+If rocblas is built with ex/ gfx11-generic it should run on any
+gfx11XX gpu. So when finding the tensile library, check first
+the specific gpu, then the generic gpu.
+
+Signed-off-by: Tom Rix <Tom.Rix@amd.com>
+---
+ projects/rocblas/library/src/tensile_host.cpp | 85 ++++++++++---------
+ 1 file changed, 47 insertions(+), 38 deletions(-)
+
+diff --git a/projects/rocblas/library/src/tensile_host.cpp b/projects/rocblas/library/src/tensile_host.cpp
+index 1c9012f5d192..4587d498f94e 100644
+--- a/projects/rocblas/library/src/tensile_host.cpp
++++ b/projects/rocblas/library/src/tensile_host.cpp
+@@ -720,7 +720,10 @@ namespace
+ #endif
+
+ // The name of the current GPU platform
+- std::string processor = rocblas_internal_get_arch_name();
++ std::string specific_processor = rocblas_internal_get_arch_name();
++ std::string generic_processor = rocblas_internal_get_generic_arch_name();
++ std::string processors[2] = {specific_processor, generic_processor};
++ std::string processor;
+ // Get current xnack mode
+ std::string xnack = rocblas_internal_get_xnack_mode();
+
+@@ -806,59 +809,65 @@ namespace
+ return 0;
+ }();
+
+- path = base_path;
+- if(TestPath(path + "/" + processor))
+- path += "/" + processor;
++ // Loop over processors to find a valid Tensile library
++ // Only call rocblas_abort on the final processor
++ for(int i = 0; i < 2; ++i)
++ {
++ processor = processors[i];
++
++ path = base_path;
++ if(TestPath(path + "/" + processor))
++ path += "/" + processor;
+
+ #ifdef TENSILE_YAML
+- tensileLibraryPath = path + "/TensileLibrary_lazy_" + processor + ".yaml";
++ tensileLibraryPath = path + "/TensileLibrary_lazy_" + processor + ".yaml";
+ #else
+- tensileLibraryPath = path + "/TensileLibrary_lazy_" + processor + ".dat";
++ tensileLibraryPath = path + "/TensileLibrary_lazy_" + processor + ".dat";
+ #endif
+- if(!TestPath(tensileLibraryPath))
+- {
+-
++ if(TestPath(tensileLibraryPath)) {
++ tensile_lazy_load_enabled = true;
++ break;
++ }
+ #ifdef TENSILE_YAML
+- tensileLibraryPath = path + "/TensileLibrary_" + processor + ".yaml";
++ tensileLibraryPath = path + "/TensileLibrary_" + processor + ".yaml";
+ #else
+- tensileLibraryPath = path + "/TensileLibrary_" + processor + ".dat";
++ tensileLibraryPath = path + "/TensileLibrary_" + processor + ".dat";
+ #endif
+- if(!TestPath(tensileLibraryPath))
+- {
++ if(TestPath(tensileLibraryPath))
++ break;
++
+ #ifdef TENSILE_YAML
+- tensileLibraryPath = path + "/TensileLibrary.yaml";
++ tensileLibraryPath = path + "/TensileLibrary.yaml";
+ #else
+- tensileLibraryPath = path + "/TensileLibrary.dat";
++ tensileLibraryPath = path + "/TensileLibrary.dat";
+ #endif
+- if(!TestPath(tensileLibraryPath))
+- {
++ if(TestPath(tensileLibraryPath))
++ break;
++
+ #if ROCBLAS_TENSILE_SEPARATE_ARCH
+- rocblas_cerr << "\nrocBLAS error: Cannot read " << tensileLibraryPath
+- << ": " << strerror(errno) << " for GPU arch : " << processor
+- << std::endl;
++ rocblas_cerr << "\nrocBLAS error: Cannot read " << tensileLibraryPath
++ << ": " << strerror(errno) << " for GPU arch : " << processor
++ << std::endl;
+ #if ROCBLAS_TENSILE_LAZY_LOAD
+- std::regex fileMatcher(path + "/TensileLibrary_lazy.*");
++ std::regex fileMatcher(path + "/TensileLibrary_lazy.*");
+ #else
+- std::regex fileMatcher(path + "/TensileLibrary_gfx\\d+.dat");
++ std::regex fileMatcher(path + "/TensileLibrary_gfx\\d+.dat");
+ #endif
+- rocblas_cerr << " List of available TensileLibrary Files : " << std::endl;
+- for(auto& file_name : fs::directory_iterator(path))
+- {
+- if(std::regex_match(file_name.path().string(), fileMatcher))
+- {
+- rocblas_cerr << file_name << std::endl;
+- }
+- }
++ rocblas_cerr << " List of available TensileLibrary Files : " << std::endl;
++ for(auto& file_name : fs::directory_iterator(path))
++ {
++ if(std::regex_match(file_name.path().string(), fileMatcher))
++ {
++ rocblas_cerr << file_name << std::endl;
++ }
++ }
+ #else
+- rocblas_cerr << "\nrocBLAS error: Cannot read " << tensileLibraryPath
+- << ": " << strerror(errno) << std::endl;
++ rocblas_cerr << "\nrocBLAS error: Cannot read " << tensileLibraryPath
++ << ": " << strerror(errno) << std::endl;
+ #endif
+- rocblas_abort();
+- }
+- }
+- }
+- else
+- tensile_lazy_load_enabled = true;
++ if (i == 1)
++ rocblas_abort();
++ }
+
+ //Supports multi architecture configuration in lazy library loading mode
+ static int initialize_once = [&] {
+--
+2.53.0
+
diff --git a/rocblas.spec b/rocblas.spec
index af1da25..5cdc99c 100644
--- a/rocblas.spec
+++ b/rocblas.spec
@@ -19,16 +19,19 @@
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
# THE SOFTWARE.
#
-%bcond_with gitcommit
-%if %{with gitcommit}
-%global commit0 de5c1aebb641af098d9310a9fcca5591a7c066c8
-%global shortcommit0 %(c=%{commit0}; echo ${c:0:7})
-%global date0 20251015
-%endif
-
%global upstreamname rocblas
+
+%bcond_with preview
+%if %{with preview}
+%global rocm_release 7.11
+%global rocm_patch 0
+%global pkg_src therock-%{rocm_release}
+%else
%global rocm_release 7.2
%global rocm_patch 0
+%global pkg_src rocm-%{rocm_release}.%{rocm_patch}
+%endif
+
%global rocm_version %{rocm_release}.%{rocm_patch}
%bcond_with compat
@@ -158,27 +161,32 @@ Name: rocblas%{pkg_suffix}
Summary: BLAS implementation for ROCm
License: MIT AND BSD-3-Clause AND 0BSD
URL: https://github.com/ROCm/rocm-libraries
-
-%if %{with gitcommit}
-Version: git%{date0}.%{shortcommit0}
-Release: 3%{?dist}
-Source0: %{url}/archive/%{commit0}/rocm-libraries-%{shortcommit0}.tar.gz
-%else
Version: %{rocm_version}
-Release: 2%{?dist}
-Source0: %{url}/releases/download/rocm-%{version}/%{upstreamname}.tar.gz#/%{upstreamname}-%{version}.tar.gz
+%if %{with preview}
+Release: 0%{?dist}
+%else
+Release: 3%{?dist}
%endif
-Patch1: 0001-fixup-install-of-tensile-output.patch
+Source0: %{url}/releases/download/%{pkg_src}/%{upstreamname}.tar.gz#/%{upstreamname}-%{version}.tar.gz
+Source1: %{url}/releases/download/%{pkg_src}/tensile.tar.gz#/tensile-%{version}.tar.gz
-# Bundled tensile
-Source1: https://github.com/ROCmSoftwarePlatform/Tensile/archive/rocm-%{version}.tar.gz#/Tensile-%{version}.tar.gz
+%if %{with preview}
+Patch1: 0001-improve-the-warning-for-asm-caps-mismatches.patch
+Patch2: 0002-add-generic-gpu-targets.patch
+Patch3: 0003-improve-fallback-name-to-handle-generics.patch
+Patch4: 0004-generic-arches-need-a-solution-index.patch
+Patch5: 0005-rocblas-add-rocblas_internal_get_generic_arch_name.patch
+Patch6: 0006-rocblas-generalize-finding-tensile-for-generics.patch
+%else
+Patch1: 0001-fixup-install-of-tensile-output.patch
Patch101: 0001-tensile-fedora-gpus.patch
Patch102: 0001-tensile-gfx1153.patch
Patch103: 0001-tensile-set-default-paths.patch
Patch104: 0001-tensile-ignore-cache-check.patch
Patch105: 0001-tensile-add-cmake-arches.patch
Patch106: 0001-tensile-gfx1036.patch
+%endif
BuildRequires: cmake
BuildRequires: gcc-c++
@@ -336,24 +344,29 @@ Requires: diffutils
%endif
%prep
-%if %{with gitcommit}
-%setup -q -n rocm-libraries-%{commit0}
-cd projects/rocblas
-%patch -P1 -p1
-%else
%setup -q -n %{upstreamname}
+%if %{with preview}
+%patch -P5 -p3
+%patch -P6 -p3
+%else
%patch -P1 -p1
%endif
tar xf %{SOURCE1}
-mv Tensile-* Tensile
-cd Tensile
+cd tensile
+%if %{with preview}
+%patch -P1 -p3
+%patch -P2 -p3
+%patch -P3 -p3
+%patch -P4 -p3
+%else
%patch -P101 -p1
%patch -P102 -p1
%patch -P103 -p1
%patch -P104 -p1
%patch -P105 -p1
%patch -P106 -p1
+%endif
#Fix a few things:
chmod 755 Tensile/Configs/miopen/convert_cfg.py
@@ -384,8 +397,13 @@ sed -i -e '/rich/d' requirements.*
sed -i -e '/msgpack/d' requirements.*
# Generalize prefix
+%if %{with preview}
+sed -i -e 's@DEFAULT_ROCM_BIN_PATH_POSIX = Path("/opt/rocm/bin")@DEFAULT_ROCM_BIN_PATH_POSIX = Path("%{pkg_prefix}/bin")@' Tensile/Utilities/Toolchain.py
+sed -i -e 's@DEFAULT_ROCM_LLVM_BIN_PATH_POSIX = Path("/opt/rocm/lib/llvm/bin")@DEFAULT_ROCM_LLVM_BIN_PATH_POSIX = Path("%{rocmllvm_bindir}")@' Tensile/Utilities/Toolchain.py
+%else
sed -i -e 's@/usr/bin@%{pkg_prefix}/bin@' Tensile/Utilities/Toolchain.py
sed -i -e 's@/usr/lib64/rocm/llvm/bin@%{rocmllvm_bindir}@' Tensile/Utilities/Toolchain.py
+%endif
# Make sure hip/hip_runtime.h is found
sed -i -e 's@"-D__HIP_HCC_COMPAT_MODE__=1"@"-D__HIP_HCC_COMPAT_MODE__=1","-I%{pkg_prefix}/include"@' Tensile/BuildCommands/SourceCommands.py
@@ -415,7 +433,7 @@ sed -i -e 's@list( APPEND COMMON_LINK_LIBS "-lgfortran")@#list( APPEND COMMON_LI
%if %{with tensile}
%if %{with bundled_tensile}
-cd Tensile
+cd tensile
TL=$PWD
python3 setup.py install --root $TL
TP=${TL}/usr/lib/python%{python3_version}/site-packages/Tensile/
@@ -425,10 +443,6 @@ TP=`/usr/bin/TensileGetPath`
%endif
%endif
-%if %{with gitcommit}
-cd projects/rocblas
-%endif
-
CORES=`lscpu | grep 'Core(s)' | awk '{ print $4 }'`
if [ ${CORES}x = x ]; then
CORES=1
@@ -452,10 +466,6 @@ export HIPCC_LINK_FLAGS_APPEND=-fuse-ld=lld
%cmake_build
%install
-%if %{with gitcommit}
-cd projects/rocblas
-%endif
-
%cmake_install
# Extra license
@@ -478,13 +488,8 @@ export LD_LIBRARY_PATH=%{_vpath_builddir}/library/src:$LD_LIBRARY_PATH
%endif
%files -n %{rocblas_name}
-%if %{with gitcommit}
-%license projects/rocblas/LICENSE.md
-%doc projects/rocblas/README.md
-%else
%license LICENSE.md
%doc README.md
-%endif
%{pkg_prefix}/%{pkg_libdir}/librocblas.so.5{,.*}
%if %{with tensile}
%{pkg_prefix}/%{pkg_libdir}/rocblas/
@@ -501,6 +506,10 @@ export LD_LIBRARY_PATH=%{_vpath_builddir}/library/src:$LD_LIBRARY_PATH
%endif
%changelog
+* Sat Mar 7 2026 Tom Rix <Tom.Rix@amd.com> - 7.2.0-3
+- Change --with gitcommit to preview
+- Use rocm-libraries for tensile source
+
* Sun Feb 15 2026 Tom Rix <Tom.Rix@amd.com> - 7.2.0-2
- strip hsaco files
- make test optional
diff --git a/sources b/sources
index 6749db9..7c48b91 100644
--- a/sources
+++ b/sources
@@ -1,2 +1,3 @@
SHA512 (Tensile-7.2.0.tar.gz) = fc1946aa1c3ebddbdab02f6966d7ed08d937e17518d192b31a54d2084972188d8c71b8d1c58f0fd5d8455cc9a3e11414f1f7dbbfd284e0c90538264b9af2c4d0
SHA512 (rocblas-7.2.0.tar.gz) = 5301a8822c4d3b9ea4223ebe001a80522605d0b2634d11e824043026fe8b148c424c4ffaa4402133dcb28857363c273aa56caa3533b91b0b6147e0289350ca1f
+SHA512 (tensile-7.2.0.tar.gz) = 8b17ee9fc2c0998242928ee923d82f7125d551940af71afc3bcfee90b02e654f9715e84f2caf2dd720e0904e670930b7a9e014b929ebeae04608ba7a128532dd