public inbox for git-commits@fedoraproject.org
From: Tom Rix <Tom.Rix@amd.com>
To: git-commits@fedoraproject.org
Subject: [rpms/rocblas] epel10: Add --with preview
Date: Thu, 11 Jun 2026 14:33:11 GMT
Message-ID: <178118839140.1.11127389856776383527.rpms-rocblas-d4ebec97e49d@fedoraproject.org>

            A new commit has been pushed.

            Repo   : rpms/rocblas
            Branch : epel10
            Commit : d4ebec97e49d38d814eb50c8b83abc6cec93b00c
            Author : Tom Rix <Tom.Rix@amd.com>
            Date   : 2026-03-10T11:37:16-07:00
            Stats  : +986/-39 in 9 file(s)
            URL    : https://src.fedoraproject.org/rpms/rocblas/c/d4ebec97e49d38d814eb50c8b83abc6cec93b00c?branch=epel10

            Log:
            Add --with preview

Change tensile source to rocm-libraries

Signed-off-by: Tom Rix <Tom.Rix@amd.com>

---
diff --git a/.gitignore b/.gitignore
index ac175f0..eada8e5 100644
--- a/.gitignore
+++ b/.gitignore
@@ -16,3 +16,4 @@
 /Tensile-7.1.1.tar.gz
 /Tensile-7.2.0.tar.gz
 /rocblas-7.2.0.tar.gz
+/tensile-7.2.0.tar.gz

diff --git a/0001-improve-the-warning-for-asm-caps-mismatches.patch b/0001-improve-the-warning-for-asm-caps-mismatches.patch
new file mode 100644
index 0000000..66005bf
--- /dev/null
+++ b/0001-improve-the-warning-for-asm-caps-mismatches.patch
@@ -0,0 +1,43 @@
+From 22338f7f0aa80c41b04ff4075a9b39957228d219 Mon Sep 17 00:00:00 2001
+From: Tom Rix <Tom.Rix@amd.com>
+Date: Sun, 8 Mar 2026 10:48:50 -0700
+Subject: [PATCH 1/6] improve the warning for asm caps mismatches
+
+This change prints the differing key/value pairs when there
+is a difference between the derived and cached asm tables.
+
+Signed-off-by: Tom Rix <Tom.Rix@amd.com>
+---
+ shared/tensile/Tensile/Common.py | 9 +++++++++
+ 1 file changed, 9 insertions(+)
+
+diff --git a/shared/tensile/Tensile/Common.py b/shared/tensile/Tensile/Common.py
+index a7bbf0724a80..b97fa061327b 100644
+--- a/shared/tensile/Tensile/Common.py
++++ b/shared/tensile/Tensile/Common.py
+@@ -2010,6 +2010,14 @@ def locateExe( defaultPath, exeName ): # /opt/rocm/bin, hip-clang
+       return exePath
+   return None
+ 
++def PrintDiff(d1, d2):
++    keys = set(d1.keys() | d2.keys())
++    for key in keys:
++        v1 = d1.get(key)
++        v2 = d2.get(key)
++        if v1 != v2:
++            printWarning(f"{key}: {v1} != {v2}")
++
+ def GetAsmCaps(isaVersion: IsaVersion, hipVersion: SemanticVersion, cachedAsmCaps: Dict[IsaVersion, dict]) -> Dict[IsaVersion, dict]:
+   """ Determine assembler capabilities by testing short instructions sequences """
+   if globalParameters["AssemblerPath"] is not None:
+@@ -2132,6 +2140,7 @@ def GetAsmCaps(isaVersion: IsaVersion, hipVersion: SemanticVersion, cachedAsmCap
+         exitFlag = True
+       if exitFlag:
+         printWarning("Cached asm caps differ from derived asm caps for {}".format(isaVersion))
++        PrintDiff(derivedAsmCaps, cachedAsmCaps[isaVersion])
+     return derivedAsmCaps
+   else:
+     printWarning("Assembler not present, asm caps loaded from cache are unverified")
+-- 
+2.53.0
+

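The key-by-key comparison that patch 1/6 adds can be tried outside Tensile. The following is a minimal sketch, not the patched code: `diff_caps` is a hypothetical name, and it returns the differences rather than calling Tensile's `printWarning`, so that it runs standalone.

```python
from typing import Any, Dict, List, Tuple


def diff_caps(derived: Dict[str, Any], cached: Dict[str, Any]) -> List[Tuple[str, Any, Any]]:
    """Return (key, derived_value, cached_value) for every key whose values differ."""
    differences = []
    # Take the union of keys so that entries present in only one table are
    # reported as well; a missing entry shows up as None via dict.get().
    for key in sorted(derived.keys() | cached.keys()):
        v1 = derived.get(key)
        v2 = cached.get(key)
        if v1 != v2:
            differences.append((key, v1, v2))
    return differences


if __name__ == "__main__":
    # Made-up sample tables, for illustration only.
    derived = {"HasWMMA": True, "MaxVmcnt": 63, "v_mov_b64": False}
    cached = {"HasWMMA": False, "MaxVmcnt": 63}
    for key, v1, v2 in diff_caps(derived, cached):
        print(f"{key}: {v1} != {v2}")
```

Sorting the keys is a choice made in this sketch to keep the output stable between runs; the patch iterates over an unordered set.
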
diff --git a/0002-add-generic-gpu-targets.patch b/0002-add-generic-gpu-targets.patch
new file mode 100644
index 0000000..68b8e28
--- /dev/null
+++ b/0002-add-generic-gpu-targets.patch
@@ -0,0 +1,591 @@
+From 60c8c0786b61e1ab2040f7b6d7b6c2b4b244c9e1 Mon Sep 17 00:00:00 2001
+From: Tom Rix <Tom.Rix@amd.com>
+Date: Sun, 8 Mar 2026 01:32:28 +0000
+Subject: [PATCH 2/6] add generic gpu targets
+
+Support generic GPU targets, e.g. -DGPU_TARGETS=gfx11-generic.
+
+Tensile does not have support for every possible GPU target.  Instead
+of adding them piecemeal, provide support for all the generic targets.
+
+In Common.py, overload the int tuple used for SupportedISA: if the last
+value is negative, the tuple denotes a generic ISA.
+Examples:
+  (10,3,-1) -> gfx10-3-generic
+  (11,0,-1) -> gfx11-generic
+
+In AsmCaps, copy each generic table from a close existing table,
+e.g. (10,3,0) was used for (10,3,-1).  Then fix the values based on
+the derived vs cached warnings during a build.
+
+Add new mapping where appropriate.
+
+Signed-off-by: Tom Rix <Tom.Rix@amd.com>
+---
+ shared/tensile/Tensile/AsmCaps.py             | 264 ++++++++++++++++++
+ shared/tensile/Tensile/Common.py              |  57 +++-
+ .../cmake/TensileSupportedArchitectures.cmake |   9 +-
+ .../Source/lib/include/Tensile/AMDGPU.hpp     |  44 ++-
+ .../include/Tensile/PlaceholderLibrary.hpp    |  18 ++
+ 5 files changed, 375 insertions(+), 17 deletions(-)
+
+diff --git a/shared/tensile/Tensile/AsmCaps.py b/shared/tensile/Tensile/AsmCaps.py
+index 48eeec1f9a6c..58776e249b78 100644
+--- a/shared/tensile/Tensile/AsmCaps.py
++++ b/shared/tensile/Tensile/AsmCaps.py
+@@ -169,6 +169,50 @@ def getCapabilitiesCache(rocmVersion: NamedTuple) -> dict:
+                  'v_mov_b64': False,
+                  'v_pk_fma_f16': True,
+                  'v_pk_fmac_f16': False},
++     (9, 0, -1): {'HasAddLshl': True,
++                 'HasAtomicAdd': False,
++                 'HasDirectToLdsDest': False,
++                 'HasDirectToLdsNoDest': True,
++                 'HasExplicitCO': True,
++                 'HasExplicitNC': False,
++                 'HasGLCModifier': True,
++                 'HasNTModifier': False,
++                 'HasLshlOr': True,
++                 'HasMFMA': False,
++                 'HasMFMA_b8': False,
++                 'HasMFMA_bf16_1k': False,
++                 'HasMFMA_bf16_original': False,
++                 'HasMFMA_constSrc': False,
++                 'HasMFMA_f64': False,
++                 'HasMFMA_f8': False,
++                 'HasMFMA_i8_908': False,
++                 'HasMFMA_i8_940': False,
++                 'HasMFMA_vgpr': False,
++                 'HasMFMA_xf32': False,
++                 'HasSMulHi': True,
++                 'HasWMMA': False,
++                 'KernargPreloading': False,
++                 'MaxLgkmcnt': 15,
++                 'MaxVmcnt': 63,
++                 'SupportedISA': True,
++                 'SupportedSource': True,
++                 'VOP3v_dot4_i32_i8': False,
++                 'v_dot2_f32_f16': False,
++                 'v_dot2c_f32_f16': False,
++                 'v_dot4_i32_i8': False,
++                 'v_dot4c_i32_i8': False,
++                 'v_fma_f16': True,
++                 'v_fma_f32': True,
++                 'v_fma_f64': True,
++                 'v_fma_mix_f32': False,
++                 'v_fmac_f16': False,
++                 'v_fmac_f32': False,
++                 'v_mac_f16': True,
++                 'v_mac_f32': True,
++                 'v_mad_mix_f32': False,
++                 'v_mov_b64': False,
++                 'v_pk_fma_f16': True,
++                 'v_pk_fmac_f16': False},
+      (9, 0, 6): {'HasAddLshl': True,
+                  'HasAtomicAdd': False,
+                  'HasDirectToLdsDest': False,
+@@ -345,6 +389,50 @@ def getCapabilitiesCache(rocmVersion: NamedTuple) -> dict:
+                  'v_mov_b64': True,
+                  'v_pk_fma_f16': True,
+                  'v_pk_fmac_f16': False},
++     (9, 4, -1): {'HasAddLshl': True,
++                 'HasAtomicAdd': True,
++                 'HasDirectToLdsDest': False,
++                 'HasDirectToLdsNoDest': True,
++                 'HasExplicitCO': True,
++                 'HasExplicitNC': False,
++                 'HasGLCModifier': False,
++                 'HasNTModifier': True,
++                 'HasLshlOr': True,
++                 'HasMFMA': True,
++                 'HasMFMA_b8': False,
++                 'HasMFMA_bf16_1k': True,
++                 'HasMFMA_bf16_original': False,
++                 'HasMFMA_constSrc': True,
++                 'HasMFMA_f64': True,
++                 'HasMFMA_f8': False,
++                 'HasMFMA_i8_908': False,
++                 'HasMFMA_i8_940': True,
++                 'HasMFMA_vgpr': True,
++                 'HasMFMA_xf32': False,
++                 'HasSMulHi': True,
++                 'HasWMMA': False,
++                 'KernargPreloading': True,
++                 'MaxLgkmcnt': 15,
++                 'MaxVmcnt': 63,
++                 'SupportedISA': True,
++                 'SupportedSource': True,
++                 'VOP3v_dot4_i32_i8': True,
++                 'v_dot2_f32_f16': True,
++                 'v_dot2c_f32_f16': True,
++                 'v_dot4_i32_i8': False,
++                 'v_dot4c_i32_i8': True,
++                 'v_fma_f16': True,
++                 'v_fma_f32': True,
++                 'v_fma_f64': True,
++                 'v_fma_mix_f32': True,
++                 'v_fmac_f16': False,
++                 'v_fmac_f32': True,
++                 'v_mac_f16': True,
++                 'v_mac_f32': False,
++                 'v_mad_mix_f32': False,
++                 'v_mov_b64': True,
++                 'v_pk_fma_f16': True,
++                 'v_pk_fmac_f16': False},
+      (9, 5, 0): {'HasAddLshl': True,
+                  'HasAtomicAdd': True,
+                  'HasDirectToLdsDest': False,
+@@ -433,6 +521,50 @@ def getCapabilitiesCache(rocmVersion: NamedTuple) -> dict:
+                   'v_mov_b64': False,
+                   'v_pk_fma_f16': True,
+                   'v_pk_fmac_f16': False},
++     (10, 1, -1): {'HasAddLshl': True,
++                  'HasAtomicAdd': False,
++                  'HasDirectToLdsDest': False,
++                  'HasDirectToLdsNoDest': True,
++                  'HasExplicitCO': True,
++                  'HasExplicitNC': True,
++                  'HasGLCModifier': True,
++                  'HasNTModifier': False,
++                  'HasLshlOr': True,
++                  'HasMFMA': False,
++                  'HasMFMA_b8': False,
++                  'HasMFMA_bf16_1k': False,
++                  'HasMFMA_bf16_original': False,
++                  'HasMFMA_constSrc': False,
++                  'HasMFMA_f64': False,
++                  'HasMFMA_f8': False,
++                  'HasMFMA_i8_908': False,
++                  'HasMFMA_i8_940': False,
++                  'HasMFMA_vgpr': False,
++                  'HasMFMA_xf32': False,
++                  'HasSMulHi': True,
++                  'HasWMMA': False,
++                  'KernargPreloading': False,
++                  'MaxLgkmcnt': 15,
++                  'MaxVmcnt': 63,
++                  'SupportedISA': True,
++                  'SupportedSource': True,
++                  'VOP3v_dot4_i32_i8': False,
++                  'v_dot2_f32_f16': False,
++                  'v_dot2c_f32_f16': False,
++                  'v_dot4_i32_i8': False,
++                  'v_dot4c_i32_i8': False,
++                  'v_fma_f16': True,
++                  'v_fma_f32': True,
++                  'v_fma_f64': True,
++                  'v_fma_mix_f32': True,
++                  'v_fmac_f16': False,
++                  'v_fmac_f32': True,
++                  'v_mac_f16': False,
++                  'v_mac_f32': True,
++                  'v_mad_mix_f32': False,
++                  'v_mov_b64': False,
++                  'v_pk_fma_f16': True,
++                  'v_pk_fmac_f16': False},
+      (10, 1, 1): {'HasAddLshl': True,
+                   'HasAtomicAdd': False,
+                   'HasDirectToLdsDest': False,
+@@ -565,6 +697,50 @@ def getCapabilitiesCache(rocmVersion: NamedTuple) -> dict:
+                   'v_mov_b64': False,
+                   'v_pk_fma_f16': True,
+                   'v_pk_fmac_f16': False},
++     (10, 3, -1): {'HasAddLshl': True,
++                  'HasAtomicAdd': False,
++                  'HasDirectToLdsDest': False,
++                  'HasDirectToLdsNoDest': True,
++                  'HasExplicitCO': True,
++                  'HasExplicitNC': True,
++                  'HasGLCModifier': True,
++                  'HasNTModifier': False,
++                  'HasLshlOr': True,
++                  'HasMFMA': False,
++                  'HasMFMA_b8': False,
++                  'HasMFMA_bf16_1k': False,
++                  'HasMFMA_bf16_original': False,
++                  'HasMFMA_constSrc': False,
++                  'HasMFMA_f64': False,
++                  'HasMFMA_f8': False,
++                  'HasMFMA_i8_908': False,
++                  'HasMFMA_i8_940': False,
++                  'HasMFMA_vgpr': False,
++                  'HasMFMA_xf32': False,
++                  'HasSMulHi': True,
++                  'HasWMMA': False,
++                  'KernargPreloading': False,
++                  'MaxLgkmcnt': 15,
++                  'MaxVmcnt': 63,
++                  'SupportedISA': True,
++                  'SupportedSource': True,
++                  'VOP3v_dot4_i32_i8': True,
++                  'v_dot2_f32_f16': True,
++                  'v_dot2c_f32_f16': True,
++                  'v_dot4_i32_i8': False,
++                  'v_dot4c_i32_i8': True,
++                  'v_fma_f16': True,
++                  'v_fma_f32': True,
++                  'v_fma_f64': True,
++                  'v_fma_mix_f32': True,
++                  'v_fmac_f16': False,
++                  'v_fmac_f32': True,
++                  'v_mac_f16': False,
++                  'v_mac_f32': False,
++                  'v_mad_mix_f32': False,
++                  'v_mov_b64': False,
++                  'v_pk_fma_f16': True,
++                  'v_pk_fmac_f16': False},
+      (10, 3, 1): {'HasAddLshl': True,
+                   'HasAtomicAdd': False,
+                   'HasDirectToLdsDest': False,
+@@ -873,6 +1049,50 @@ def getCapabilitiesCache(rocmVersion: NamedTuple) -> dict:
+                   'v_mov_b64': False,
+                   'v_pk_fma_f16': True,
+                   'v_pk_fmac_f16': False},
++     (11, 0, -1): {'HasAddLshl': True,
++                  'HasAtomicAdd': True,
++                  'HasDirectToLdsDest': False,
++                  'HasDirectToLdsNoDest': False,
++                  'HasExplicitCO': True,
++                  'HasExplicitNC': True,
++                  'HasGLCModifier': True,
++                  'HasNTModifier': False,
++                  'HasLshlOr': True,
++                  'HasMFMA': False,
++                  'HasMFMA_b8': False,
++                  'HasMFMA_bf16_1k': False,
++                  'HasMFMA_bf16_original': False,
++                  'HasMFMA_constSrc': False,
++                  'HasMFMA_f64': False,
++                  'HasMFMA_f8': False,
++                  'HasMFMA_i8_908': False,
++                  'HasMFMA_i8_940': False,
++                  'HasMFMA_vgpr': False,
++                  'HasMFMA_xf32': False,
++                  'HasSMulHi': True,
++                  'HasWMMA': True,
++                  'KernargPreloading': False,
++                  'MaxLgkmcnt': 15,
++                  'MaxVmcnt': 63,
++                  'SupportedISA': True,
++                  'SupportedSource': True,
++                  'VOP3v_dot4_i32_i8': True,
++                  'v_dot2_f32_f16': True,
++                  'v_dot2c_f32_f16': True,
++                  'v_dot4_i32_i8': False,
++                  'v_dot4c_i32_i8': False,
++                  'v_fma_f16': True,
++                  'v_fma_f32': True,
++                  'v_fma_f64': True,
++                  'v_fma_mix_f32': True,
++                  'v_fmac_f16': False,
++                  'v_fmac_f32': True,
++                  'v_mac_f16': False,
++                  'v_mac_f32': False,
++                  'v_mad_mix_f32': False,
++                  'v_mov_b64': False,
++                  'v_pk_fma_f16': True,
++                  'v_pk_fmac_f16': False},
+      (11, 0, 1): {'HasAddLshl': True,
+                   'HasAtomicAdd': True,
+                   'HasDirectToLdsDest': False,
+@@ -1225,6 +1445,50 @@ def getCapabilitiesCache(rocmVersion: NamedTuple) -> dict:
+                   'v_mov_b64': False,
+                   'v_pk_fma_f16': True,
+                   'v_pk_fmac_f16': False},
++     (12, 0, -1): {'HasAddLshl': True,
++                  'HasAtomicAdd': False,
++                  'HasDirectToLdsDest': False,
++                  'HasDirectToLdsNoDest': False,
++                  'HasExplicitCO': True,
++                  'HasExplicitNC': True,
++                  'HasGLCModifier': False,
++                  'HasNTModifier': False,
++                  'HasLshlOr': True,
++                  'HasMFMA': False,
++                  'HasMFMA_b8': False,
++                  'HasMFMA_bf16_1k': False,
++                  'HasMFMA_bf16_original': False,
++                  'HasMFMA_constSrc': False,
++                  'HasMFMA_f64': False,
++                  'HasMFMA_f8': False,
++                  'HasMFMA_i8_908': False,
++                  'HasMFMA_i8_940': False,
++                  'HasMFMA_vgpr': False,
++                  'HasMFMA_xf32': False,
++                  'HasSMulHi': True,
++                  'HasWMMA': False,
++                  'KernargPreloading': False,
++                  'MaxLgkmcnt': 15,
++                  'MaxVmcnt': 63,
++                  'SupportedISA': True,
++                  'SupportedSource': True,
++                  'VOP3v_dot4_i32_i8': True,
++                  'v_dot2_f32_f16': True,
++                  'v_dot2c_f32_f16': False,
++                  'v_dot4_i32_i8': False,
++                  'v_dot4c_i32_i8': False,
++                  'v_fma_f16': True,
++                  'v_fma_f32': True,
++                  'v_fma_f64': True,
++                  'v_fma_mix_f32': True,
++                  'v_fmac_f16': False,
++                  'v_fmac_f32': True,
++                  'v_mac_f16': False,
++                  'v_mac_f32': False,
++                  'v_mad_mix_f32': False,
++                  'v_mov_b64': False,
++                  'v_pk_fma_f16': True,
++                  'v_pk_fmac_f16': False},
+      (12, 0, 1): {'HasAddLshl': True,
+                   'HasAtomicAdd': False,
+                   'HasDirectToLdsDest': False,
+diff --git a/shared/tensile/Tensile/Common.py b/shared/tensile/Tensile/Common.py
+index b97fa061327b..9a2c399fad1b 100644
+--- a/shared/tensile/Tensile/Common.py
++++ b/shared/tensile/Tensile/Common.py
+@@ -246,12 +246,12 @@ globalParameters["NumMergedFiles"] = 1            # The number of files that ker
+ 
+ globalParameters["MaxFileName"] = 64              # If a file name would be longer than this, shorten it with a hash.
+ globalParameters["SupportedISA"] = [(8,0,3),
+-                                    (9,0,0), (9,0,6), (9,0,8), (9,0,10),
+-                                    (9,4,2), (9,5,0),
+-                                    (10,1,0), (10,1,1), (10,1,2), (10,3,0), (10,3,1), (10,3,2), (10,3,3), (10,3,4), (10,3,5), (10,3,6),
+-                                    (11,0,0), (11,0,1), (11,0,2), (11,0,3),
++                                    (9,0,0), (9,0,6), (9,0,8), (9,0,10), (9,0,-1),
++                                    (9,4,2), (9,4,-1), (9,5,0),
++                                    (10,1,0), (10,1,1), (10,1,2), (10,1,-1), (10,3,0), (10,3,1), (10,3,2), (10,3,3), (10,3,4), (10,3,5), (10,3,6), (10,3,-1),
++                                    (11,0,0), (11,0,1), (11,0,2), (11,0,3), (11,0,-1),
+                                     (11,5,0), (11,5,1), (11,5,2), (11,5,3),
+-                                    (12,0,0), (12,0,1)] # assembly kernels writer supports these architectures
++                                    (12,0,0), (12,0,1), (12,0,-1)] # assembly kernels writer supports these architectures
+ 
+ globalParameters["KeepBuildTmp"] = True                           # Do not remove build artifacts during the build process or build_tmp after build completes
+ globalParameters["GenerateManifestAndExit"] = False               # Output manifest file with list of expected library objects and exit
+@@ -320,15 +320,15 @@ architectureMap = {
+   'gfx803':'r9nano', 'gfx900':'vega10', 'gfx900:xnack-':'vega10',
+   'gfx906':'vega20', 'gfx906:xnack+':'vega20', 'gfx906:xnack-':'vega20',
+   'gfx908':'arcturus','gfx908:xnack+':'arcturus', 'gfx908:xnack-':'arcturus',
+-  'gfx90a':'aldebaran', 'gfx90a:xnack+':'aldebaran', 'gfx90a:xnack-':'aldebaran',
+-  'gfx942':'aquavanjaram942', 'gfx942:xnack+':'aquavanjaram942', 'gfx942:xnack-':'aquavanjaram942',
++  'gfx90a':'aldebaran', 'gfx90a:xnack+':'aldebaran', 'gfx90a:xnack-':'aldebaran', 'gfx9-generic':'gfx9-generic',
++  'gfx942':'aquavanjaram942', 'gfx942:xnack+':'aquavanjaram942', 'gfx942:xnack-':'aquavanjaram942', 'gfx9-4-generic':'gfx9-4-generic',
+   'gfx950':'gfx950', 'gfx950:xnack+':'gfx950', 'gfx950:xnack-':'gfx950',
+-  'gfx1010':'navi10', 'gfx1011':'navi12', 'gfx1012':'navi14',
+-  'gfx1030':'navi21', 'gfx1031':'navi22', 'gfx1032':'navi23', 'gfx1033':'van gogh', 'gfx1034':'navi24', 'gfx1035':'rembrandt', 'gfx1036':'raphael',
+-  'gfx1100':'navi31', 'gfx1101':'navi32', 'gfx1102':'navi33', 'gfx1103':'gfx1103',
++  'gfx1010':'navi10', 'gfx1011':'navi12', 'gfx1012':'navi14', 'gfx10-1-generic':'gfx10-1-generic',
++  'gfx1030':'navi21', 'gfx1031':'navi22', 'gfx1032':'navi23', 'gfx1033':'van gogh', 'gfx1034':'navi24', 'gfx1035':'rembrandt', 'gfx1036':'raphael', 'gfx10-3-generic':'gfx10-3-generic',
++  'gfx1100':'navi31', 'gfx1101':'navi32', 'gfx1102':'navi33', 'gfx1103':'gfx1103', 'gfx11-generic':'gfx11-generic',
+   'gfx1150':'strixpoint', 'gfx1151':'strixhalo', 'gfx1152':'gfx1152', 'gfx1153':'gfx1153',
+   'gfx1200':'gfx1200',
+-  'gfx1201':'gfx1201'
++  'gfx1201':'gfx1201', 'gfx12-generic':'gfx12-generic',
+ }
+ 
+ def getArchitectureName(gfxName: str) -> Optional[str]:
+@@ -2201,6 +2201,21 @@ def tryAssembler(isaVersion, asmString, debug=False, *options):
+ 
+ def gfxArch(name: str) -> Optional[IsaVersion]:
+     import re
++
++    # Handle special case for generic architectures like 'gfx10-3-generic'
++    generic_match = re.search(r'gfx([0-9]+)-([0-9]+)-generic', name)
++    if generic_match:
++        major = int(generic_match.group(1))
++        minor = int(generic_match.group(2))
++        return (major, minor, -1)  # step=-1 to indicate generic
++
++    # Handle special case for generic architectures like 'gfx11-generic'
++    generic_match = re.search(r'gfx([0-9]+)-generic', name)
++    if generic_match:
++        major = int(generic_match.group(1))
++        return (major, 0, -1)  # step=-1 to indicate generic, minor=0
++
++    # Handle regular architectures like 'gfx900', 'gfx803' etc.
+     match = re.search(r'gfx([0-9a-fA-F]{3,})', name)
+     if not match: return None
+ 
+@@ -2219,11 +2234,23 @@ def gfxArch(name: str) -> Optional[IsaVersion]:
+     return rv
+ 
+ def gfxName(arch):
+-    # convert last digit to hex because reasons
+-    name = str(arch[0]) + str(arch[1]) + ('%x' % arch[2])
++    # If arch[2] is negative, this is a generic target
++    if arch[2] < 0:
++        if arch[0] == 9:
++            if arch[1] == 4:
++                name = str(arch[0]) + '-' + str(arch[1]) + '-generic'
++            else:
++                name = str(arch[0]) + '-generic'
++        elif arch[0] == 10:
++            name = str(arch[0]) + '-' + str(arch[1]) + '-generic'
++        else:
++            name = str(arch[0]) + '-generic'
++    else:
++        # The normal case
++        # convert last digit to hex because reasons
++        name = str(arch[0]) + str(arch[1]) + ('%x' % arch[2])
+     return 'gfx' + ''.join(map(str,name))
+ 
+-
+ def detectIsaWindows(output):
+     i = 0
+     for line in output:
+@@ -2475,7 +2502,7 @@ def assignGlobalParameters( config, capabilitiesCache: Optional[dict] = None ):
+     if os.name == "nt":
+       globalParameters["CurrentISA"] = (9,0,6)
+       printWarning("Failed to detect ISA so forcing (gfx906) on windows")
+-  isasWithDisabledHWMonitor = ((9,4,2), (9,5,0), (11,0,0), (11,0,1), (11,0,2), (11,0,3), (11,5,0), (11,5,1), (11,5,2), (11,5,3), (12,0,0), (12,0,1))
++  isasWithDisabledHWMonitor = ((9,0,-1), (9,4,2), (9,4,-1), (9,5,0), (10,1,-1), (10,3,-1), (11,0,0), (11,0,1), (11,0,2), (11,0,3), (11,5,0), (11,5,1), (11,5,2), (11,5,3), (11,0,-1), (12,0,0), (12,0,1), (12,0,-1))
+   if globalParameters["CurrentISA"] in isasWithDisabledHWMonitor:
+     isaString = ', '.join(map(gfxName, isasWithDisabledHWMonitor))
+     printWarning(f"HardwareMonitor currently disabled for {isaString}")
+diff --git a/shared/tensile/Tensile/Source/cmake/TensileSupportedArchitectures.cmake b/shared/tensile/Tensile/Source/cmake/TensileSupportedArchitectures.cmake
+index a1fb7166cf63..5f3e2d54a003 100644
+--- a/shared/tensile/Tensile/Source/cmake/TensileSupportedArchitectures.cmake
++++ b/shared/tensile/Tensile/Source/cmake/TensileSupportedArchitectures.cmake
+@@ -35,11 +35,14 @@ if(NOT BUILD_ADDRESS_SANITIZER)
+         "gfx906"
+         "gfx908"
+         "gfx90a"
++	"gfx9-generic"
+         "gfx942"
++	"gfx9-4-generic"
+         "gfx950"
+         "gfx1010"
+         "gfx1011"
+         "gfx1012"
++	"gfx10-1-generic"
+         "gfx1030"
+         "gfx1031"
+         "gfx1032"
+@@ -47,6 +50,7 @@ if(NOT BUILD_ADDRESS_SANITIZER)
+         "gfx1034"
+         "gfx1035"
+         "gfx1036"
++        "gfx10-3-generic"
+         "gfx1100"
+         "gfx1101"
+         "gfx1102"
+@@ -55,8 +59,11 @@ if(NOT BUILD_ADDRESS_SANITIZER)
+         "gfx1151"
+         "gfx1152"
+         "gfx1153"
++        "gfx11-generic"
+         "gfx1200"
+-        "gfx1201")
++        "gfx1201"
++	"gfx12-generic"
++      )
+ 
+     set(SUPPORTED_ARCHITECTURES ${BASE_ARCHITECTURES})
+     list(APPEND SUPPORTED_ARCHITECTURES
+diff --git a/shared/tensile/Tensile/Source/lib/include/Tensile/AMDGPU.hpp b/shared/tensile/Tensile/Source/lib/include/Tensile/AMDGPU.hpp
+index 1d22bfe712da..be9d5a78c077 100644
+--- a/shared/tensile/Tensile/Source/lib/include/Tensile/AMDGPU.hpp
++++ b/shared/tensile/Tensile/Source/lib/include/Tensile/AMDGPU.hpp
+@@ -81,7 +81,13 @@ namespace Tensile
+             gfx1152 = 1152,
+             gfx1153 = 1153,
+             gfx1200 = 1200,
+-            gfx1201 = 1201
++            gfx1201 = 1201,
++	    gfx9_generic = -900,
++	    gfx9_4_generic = -940,
++	    gfx10_1_generic = -1010,
++	    gfx10_3_generic = -1030,
++	    gfx11_generic = -1100,
++	    gfx12_generic = -1200,
+         };
+ 
+         static std::string toString(Processor p)
+@@ -142,6 +148,18 @@ namespace Tensile
+                 return "gfx1200";
+             case AMDGPU::Processor::gfx1201:
+                 return "gfx1201";
++	    case AMDGPU::Processor::gfx9_generic:
++                return "gfx9-generic";
++	    case AMDGPU::Processor::gfx9_4_generic:
++                return "gfx9-4-generic";
++	    case AMDGPU::Processor::gfx10_1_generic:
++                return "gfx10-1-generic";
++	    case AMDGPU::Processor::gfx10_3_generic:
++                return "gfx10-3-generic";
++	    case AMDGPU::Processor::gfx11_generic:
++                return "gfx11-generic";
++	    case AMDGPU::Processor::gfx12_generic:
++                return "gfx12-generic";
+             }
+             return "";
+         }
+@@ -256,6 +274,30 @@ namespace Tensile
+             {
+                 return AMDGPU::Processor::gfx1201;
+             }
++	    else if(deviceString.find("gfx9-generic") != std::string::npos)
++            {
++                return AMDGPU::Processor::gfx9_generic;
++            }
++	    else if(deviceString.find("gfx9-4-generic") != std::string::npos)
++            {
++                return AMDGPU::Processor::gfx9_4_generic;
++            }
++	    else if(deviceString.find("gfx10-1-generic") != std::string::npos)
++            {
++                return AMDGPU::Processor::gfx10_1_generic;
++            }
++	    else if(deviceString.find("gfx10-3-generic") != std::string::npos)
++            {
++                return AMDGPU::Processor::gfx10_3_generic;
++            }
++	    else if(deviceString.find("gfx11-generic") != std::string::npos)
++            {
++                return AMDGPU::Processor::gfx11_generic;
++            }
++	    else if(deviceString.find("gfx12-generic") != std::string::npos)
++            {
++                return AMDGPU::Processor::gfx12_generic;
++            }
+             else
+             {
+                 return static_cast<AMDGPU::Processor>(0);
+diff --git a/shared/tensile/Tensile/Source/lib/include/Tensile/PlaceholderLibrary.hpp b/shared/tensile/Tensile/Source/lib/include/Tensile/PlaceholderLibrary.hpp
+index a9da044e8f39..2f8b18779936 100644
+--- a/shared/tensile/Tensile/Source/lib/include/Tensile/PlaceholderLibrary.hpp
++++ b/shared/tensile/Tensile/Source/lib/include/Tensile/PlaceholderLibrary.hpp
+@@ -66,6 +66,12 @@ namespace Tensile
+         gfx1153,
+         gfx1200,
+         gfx1201,
++	gfx9_generic,
++	gfx9_4_generic,
++	gfx10_1_generic,
++	gfx10_3_generic,
++	gfx11_generic,
++	gfx12_generic,
+         All
+     };
+ 
+@@ -130,6 +136,18 @@ namespace Tensile
+             return "TensileLibrary_*_gfx1200";
+         case LazyLoadingInit::gfx1201:
+             return "TensileLibrary_*_gfx1201";
++	case LazyLoadingInit::gfx9_generic:
++            return "TensileLibrary_*_gfx9-generic";
++	case LazyLoadingInit::gfx9_4_generic:
++            return "TensileLibrary_*_gfx9-4-generic";
++	case LazyLoadingInit::gfx10_1_generic:
++            return "TensileLibrary_*_gfx10-1-generic";
++	case LazyLoadingInit::gfx10_3_generic:
++            return "TensileLibrary_*_gfx10-3-generic";
++    	case LazyLoadingInit::gfx11_generic:
++            return "TensileLibrary_*_gfx11-generic";
++	case LazyLoadingInit::gfx12_generic:
++            return "TensileLibrary_*_gfx12-generic";
+         case LazyLoadingInit::None:
+             return "";
+         }
+-- 
+2.53.0
+

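The tuple convention from patch 2/6, where a negative last element marks a generic ISA, can be exercised in isolation. This is a sketch under stated assumptions: `parse_generic` and `generic_name` are hypothetical names, and `re.fullmatch` is used here where the patch uses `re.search`. The naming rules mirror the patched `gfxName`.

```python
import re
from typing import Optional, Tuple

IsaTuple = Tuple[int, int, int]


def parse_generic(name: str) -> Optional[IsaTuple]:
    """Map a generic target name to (major, minor, -1), or None if not generic."""
    # Two-number form, e.g. 'gfx10-3-generic' -> (10, 3, -1)
    m = re.fullmatch(r"gfx([0-9]+)-([0-9]+)-generic", name)
    if m:
        return (int(m.group(1)), int(m.group(2)), -1)
    # One-number form, e.g. 'gfx11-generic' -> (11, 0, -1)
    m = re.fullmatch(r"gfx([0-9]+)-generic", name)
    if m:
        return (int(m.group(1)), 0, -1)
    return None


def generic_name(arch: IsaTuple) -> str:
    """Inverse mapping for generic tuples only."""
    major, minor, step = arch
    if step >= 0:
        raise ValueError("not a generic target")
    # gfx9-4 and the gfx10 families keep the minor number in the name.
    if (major == 9 and minor == 4) or major == 10:
        return f"gfx{major}-{minor}-generic"
    return f"gfx{major}-generic"


if __name__ == "__main__":
    for target in ("gfx9-generic", "gfx9-4-generic", "gfx10-1-generic",
                   "gfx10-3-generic", "gfx11-generic", "gfx12-generic"):
        print(target, parse_generic(target))
```
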
diff --git a/0003-improve-fallback-name-to-handle-generics.patch b/0003-improve-fallback-name-to-handle-generics.patch
new file mode 100644
index 0000000..68859a0
--- /dev/null
+++ b/0003-improve-fallback-name-to-handle-generics.patch
@@ -0,0 +1,32 @@
+From 6f042a916612aca518254d5870590d15ec7a16e6 Mon Sep 17 00:00:00 2001
+From: Tom Rix <Tom.Rix@amd.com>
+Date: Sun, 8 Mar 2026 13:38:28 -0700
+Subject: [PATCH 3/6] improve fallback name to handle generics
+
+The archName can be of the form gfx90a-xnack{+,-} and this function
+determines the fallback is gfx90a.  However when the archName is
+a generic, ex gfx11-generic, the entire name must be used.  So
+check if the name ends with -generic and skip splitting.
+
+Signed-off-by: Tom Rix <Tom.Rix@amd.com>
+---
+ shared/tensile/Tensile/TensileCreateLibrary.py | 3 ++-
+ 1 file changed, 2 insertions(+), 1 deletion(-)
+
+diff --git a/shared/tensile/Tensile/TensileCreateLibrary.py b/shared/tensile/Tensile/TensileCreateLibrary.py
+index 543b0379c41e..eb7147a4fd8a 100644
+--- a/shared/tensile/Tensile/TensileCreateLibrary.py
++++ b/shared/tensile/Tensile/TensileCreateLibrary.py
+@@ -962,7 +962,8 @@ def addFallback(masterLibraries: Dict[str, MasterSolutionLibrary]) -> None:
+             value.insert(masterLibraries["fallback"])
+ 
+     for archName in archs:
+-        archName = archName.split("-", 1)[0]
++        if not archName.endswith("-generic"):
++            archName = archName.split("-", 1)[0]
+         if archName not in masterLibraries:
+             tPrint(1, "Using fallback for arch: " + archName)
+             masterLibraries[archName] = masterLibraries["fallback"]
+-- 
+2.53.0
+

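The effect of patch 3/6 on the fallback key can be seen with a few sample names. This is a minimal sketch; `fallback_key_old` and `fallback_key_new` are hypothetical helpers that isolate the one changed statement from `addFallback`.

```python
def fallback_key_old(arch_name: str) -> str:
    """Behaviour before the patch: always cut at the first '-'."""
    return arch_name.split("-", 1)[0]


def fallback_key_new(arch_name: str) -> str:
    """Behaviour after the patch: generic names are kept whole."""
    if arch_name.endswith("-generic"):
        return arch_name
    return arch_name.split("-", 1)[0]


if __name__ == "__main__":
    for name in ("gfx90a-xnack+", "gfx942", "gfx11-generic", "gfx10-3-generic"):
        print(name, fallback_key_old(name), fallback_key_new(name))
```

With the old rule, `gfx11-generic` would be shortened to `gfx11`, which is not an architecture name used elsewhere in the patch series.
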
diff --git a/0004-generic-arches-need-a-solution-index.patch b/0004-generic-arches-need-a-solution-index.patch
new file mode 100644
index 0000000..be1b231
--- /dev/null
+++ b/0004-generic-arches-need-a-solution-index.patch
@@ -0,0 +1,45 @@
+From 71f280ea73630c0453fda896a36d0b3092b95aed Mon Sep 17 00:00:00 2001
+From: Tom Rix <Tom.Rix@amd.com>
+Date: Sun, 8 Mar 2026 16:21:07 -0700
+Subject: [PATCH 4/6] generic arches need a solution index
+
+To avoid overlap with the regular GPU indices, pick
+a shift value that keeps the generic indices distinct.
+
+(9 << 29) >> 18 = 18432
+
+Signed-off-by: Tom Rix <Tom.Rix@amd.com>
+---
+ shared/tensile/Tensile/SolutionLibrary.py | 9 +++++++--
+ 1 file changed, 7 insertions(+), 2 deletions(-)
+
+diff --git a/shared/tensile/Tensile/SolutionLibrary.py b/shared/tensile/Tensile/SolutionLibrary.py
+index 0c7b6428d624..e7c4b7457737 100644
+--- a/shared/tensile/Tensile/SolutionLibrary.py
++++ b/shared/tensile/Tensile/SolutionLibrary.py
+@@ -255,7 +255,7 @@ class MasterSolutionLibrary:
+         """Maps hex characters from gfx name to an index.
+ 
+         Given a gfx name of the form gfx[0-9a-f]*, map the characters following
+-        gfx from hex to int and left shift the integer by 18.
++        gfx from hex to int and left shift the integer by 18 (or 29 for generic architectures).
+ 
+         Args:
+             architectureName: The gfx name (or fallback).
+@@ -273,7 +273,12 @@ class MasterSolutionLibrary:
+             archString = re.search('(?<=gfx)[0-9a-f]*', architectureName)
+             if archString is not None:
+                 archLiteral = archString.group(0)
+-                archval = (int(archLiteral, 16) << 18)
++                # Use left shift of 29 for generic architectures, 18 otherwise
++                if architectureName.endswith("-generic"):
++                    shift_bits = 29
++                else:
++                    shift_bits = 18
++                archval = (int(archLiteral, 16) << shift_bits)
+         # Check for duplicate architecture values
+         if archval >= 0 and not archval in cls.ArchitectureSet:
+             cls.ArchitectureSet.add(archval)
+-- 
+2.53.0
+
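
Note on patch 4/6: the arithmetic in the commit message can be checked with a small standalone sketch. `arch_index` is a hypothetical name, and the sketch covers only gfx names (the real method also handles the fallback entry and the duplicate check, which are omitted here). The regular expression and the shift values 18 and 29 are copied from the hunk.

```python
import re


def arch_index(architecture_name: str) -> int:
    """Map a gfx name to an index base; returns -1 if the name has no gfx prefix (sketch only)."""
    match = re.search(r"(?<=gfx)[0-9a-f]*", architecture_name)
    if match is None:
        return -1
    # Shift by 29 for generic architectures, 18 otherwise, as in the patch.
    shift_bits = 29 if architecture_name.endswith("-generic") else 18
    return int(match.group(0), 16) << shift_bits


# Expressed in units of the regular shift, gfx9-generic lands at (9 << 29) >> 18.
print(arch_index("gfx9-generic") >> 18)
# A regular name simply round-trips to its hex value.
print(hex(arch_index("gfx1251") >> 18))
```

Under this sketch the smallest generic value (18432, for gfx9-generic) is above 0x1251 (4689), the largest regular name that appears in patch 5/6. Note that the sketch reads only the hex digits directly after `gfx`, so `gfx9-generic` and `gfx9-4-generic` produce the same value here; whether that matters depends on code outside this hunk.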

diff --git a/0005-rocblas-add-rocblas_internal_get_generic_arch_name.patch b/0005-rocblas-add-rocblas_internal_get_generic_arch_name.patch
new file mode 100644
index 0000000..19d3b70
--- /dev/null
+++ b/0005-rocblas-add-rocblas_internal_get_generic_arch_name.patch
@@ -0,0 +1,89 @@
+From 8926fb0fca00d1ff859682b3df91243cff650425 Mon Sep 17 00:00:00 2001
+From: Tom Rix <Tom.Rix@amd.com>
+Date: Mon, 9 Mar 2026 18:15:43 -0700
+Subject: [PATCH 5/6] [rocblas] add rocblas_internal_get_generic_arch_name
+
+A function similar to rocblas_internal_get_arch_name that
+returns the generic name for the arch.
+
+Signed-off-by: Tom Rix <Tom.Rix@amd.com>
+---
+ .../rocblas/library/src/include/utility.hpp   |  3 ++
+ .../rocblas/library/src/rocblas_auxiliary.cpp | 47 +++++++++++++++++++
+ 2 files changed, 50 insertions(+)
+
+diff --git a/projects/rocblas/library/src/include/utility.hpp b/projects/rocblas/library/src/include/utility.hpp
+index bb4212f78630..966958c9aca4 100644
+--- a/projects/rocblas/library/src/include/utility.hpp
++++ b/projects/rocblas/library/src/include/utility.hpp
+@@ -800,6 +800,9 @@ bool rocblas_internal_tensile_supports_ldc_ne_ldd(rocblas_handle handle);
+ // We assume true if the value is between 942 to 1000
+ ROCBLAS_INTERNAL_EXPORT bool rocblas_internal_tensile_supports_xdl_math_op(rocblas_math_mode mode);
+ 
++// for internal use
++ROCBLAS_INTERNAL_EXPORT std::string rocblas_internal_get_generic_arch_name();
++
+ // for internal use during testing, fetch arch name
+ ROCBLAS_INTERNAL_EXPORT std::string rocblas_internal_get_arch_name();
+ 
+diff --git a/projects/rocblas/library/src/rocblas_auxiliary.cpp b/projects/rocblas/library/src/rocblas_auxiliary.cpp
+index 57c24a9f519d..3f7c375eefc4 100644
+--- a/projects/rocblas/library/src/rocblas_auxiliary.cpp
++++ b/projects/rocblas/library/src/rocblas_auxiliary.cpp
+@@ -917,6 +917,53 @@ bool rocblas_internal_tensile_supports_xdl_math_op(rocblas_math_mode mode)
+     return (deviceString.find("gfx942") != std::string::npos);
+ }
+ 
++std::string rocblas_internal_get_generic_arch_name()
++{
++  std::string arch_name = rocblas_internal_get_arch_name();
++  // Map specific architecture names to generic names
++  static const std::map<std::string, std::string> arch_map = {
++    {"gfx900", "gfx9-generic"},
++    {"gfx902", "gfx9-generic"},
++    {"gfx904", "gfx9-generic"},
++    {"gfx906", "gfx9-generic"},
++    {"gfx908", "gfx9-generic"},
++    {"gfx909", "gfx9-generic"},
++    {"gfx90a", "gfx9-generic"},
++    {"gfx940", "gfx9-4-generic"},
++    {"gfx941", "gfx9-4-generic"},
++    {"gfx942", "gfx9-4-generic"},
++    {"gfx1010", "gfx10-1-generic"},
++    {"gfx1011", "gfx10-1-generic"},
++    {"gfx1012", "gfx10-1-generic"},
++    {"gfx1013", "gfx10-1-generic"},
++    {"gfx1030", "gfx10-3-generic"},
++    {"gfx1031", "gfx10-3-generic"},
++    {"gfx1032", "gfx10-3-generic"},
++    {"gfx1033", "gfx10-3-generic"},
++    {"gfx1034", "gfx10-3-generic"},
++    {"gfx1035", "gfx10-3-generic"},
++    {"gfx1036", "gfx10-3-generic"},
++    {"gfx1100", "gfx11-generic"},
++    {"gfx1101", "gfx11-generic"},
++    {"gfx1102", "gfx11-generic"},
++    {"gfx1103", "gfx11-generic"},
++    {"gfx1150", "gfx11-generic"},
++    {"gfx1151", "gfx11-generic"},
++    {"gfx1152", "gfx11-generic"},
++    {"gfx1153", "gfx11-generic"},
++    {"gfx1200", "gfx12-generic"},
++    {"gfx1201", "gfx12-generic"},
++    {"gfx1250", "gfx12-generic"},
++    {"gfx1251", "gfx12-generic"}
++  };
++
++  auto it = arch_map.find(arch_name);
++  if(it != arch_map.end())
++    return it->second;
++
++  // Return original name if no mapping found
++  return arch_name;
++}
+ // exported. Get architecture name
+ std::string rocblas_internal_get_arch_name()
+ {
+-- 
+2.53.0
+
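
Note on patch 5/6: the lookup is a plain table with a pass-through default. A minimal Python rendering of the same idea, using only a handful of entries copied from the C++ map, might look like this. `GENERIC_ARCH` and `generic_arch_name` are hypothetical names, and `gfx803` is used only as an example of a name that is not in the table.

```python
# Subset of the specific-to-generic mapping from the patch (not the full table).
GENERIC_ARCH = {
    "gfx90a": "gfx9-generic",
    "gfx942": "gfx9-4-generic",
    "gfx1030": "gfx10-3-generic",
    "gfx1100": "gfx11-generic",
    "gfx1201": "gfx12-generic",
}


def generic_arch_name(arch_name: str) -> str:
    """Return the generic name for arch_name, or arch_name itself if unmapped."""
    return GENERIC_ARCH.get(arch_name, arch_name)


print(generic_arch_name("gfx1100"))
print(generic_arch_name("gfx803"))
```

Returning the original name for unmapped architectures mirrors the final `return arch_name;` in the C++ function.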

diff --git a/0006-rocblas-generalize-finding-tensile-for-generics.patch b/0006-rocblas-generalize-finding-tensile-for-generics.patch
new file mode 100644
index 0000000..d8be7e8
--- /dev/null
+++ b/0006-rocblas-generalize-finding-tensile-for-generics.patch
@@ -0,0 +1,136 @@
+From 4bf4de5e52725e5d253eef646d770004ef9db772 Mon Sep 17 00:00:00 2001
+From: Tom Rix <Tom.Rix@amd.com>
+Date: Tue, 10 Mar 2026 07:06:47 -0700
+Subject: [PATCH 6/6] [rocblas] generalize finding tensile for generics
+
+If rocblas is built with e.g. gfx11-generic, it should run on any
+gfx11XX gpu.  So when finding the tensile library, check first
+the specific gpu, then the generic gpu.
+
+Signed-off-by: Tom Rix <Tom.Rix@amd.com>
+---
+ projects/rocblas/library/src/tensile_host.cpp | 85 ++++++++++---------
+ 1 file changed, 47 insertions(+), 38 deletions(-)
+
+diff --git a/projects/rocblas/library/src/tensile_host.cpp b/projects/rocblas/library/src/tensile_host.cpp
+index 1c9012f5d192..4587d498f94e 100644
+--- a/projects/rocblas/library/src/tensile_host.cpp
++++ b/projects/rocblas/library/src/tensile_host.cpp
+@@ -720,7 +720,10 @@ namespace
+ #endif
+ 
+             // The name of the current GPU platform
+-            std::string processor = rocblas_internal_get_arch_name();
++	    std::string specific_processor = rocblas_internal_get_arch_name();
++	    std::string generic_processor = rocblas_internal_get_generic_arch_name();
++	    std::string processors[2] = {specific_processor, generic_processor};
++	    std::string processor;
+             // Get current xnack mode
+             std::string xnack = rocblas_internal_get_xnack_mode();
+ 
+@@ -806,59 +809,65 @@ namespace
+                 return 0;
+             }();
+ 
+-            path = base_path;
+-            if(TestPath(path + "/" + processor))
+-                path += "/" + processor;
++            // Loop over processors to find a valid Tensile library
++            // Only call rocblas_abort on the final processor
++            for(int i = 0; i < 2; ++i)
++            {
++	        processor = processors[i];
++
++		path = base_path;
++		if(TestPath(path + "/" + processor))
++		  path += "/" + processor;
+ 
+ #ifdef TENSILE_YAML
+-            tensileLibraryPath = path + "/TensileLibrary_lazy_" + processor + ".yaml";
++		tensileLibraryPath = path + "/TensileLibrary_lazy_" + processor + ".yaml";
+ #else
+-            tensileLibraryPath = path + "/TensileLibrary_lazy_" + processor + ".dat";
++		tensileLibraryPath = path + "/TensileLibrary_lazy_" + processor + ".dat";
+ #endif
+-            if(!TestPath(tensileLibraryPath))
+-            {
+-
++		if(TestPath(tensileLibraryPath)) {
++		  tensile_lazy_load_enabled = true;
++		  break;
++		}
+ #ifdef TENSILE_YAML
+-                tensileLibraryPath = path + "/TensileLibrary_" + processor + ".yaml";
++		tensileLibraryPath = path + "/TensileLibrary_" + processor + ".yaml";
+ #else
+-                tensileLibraryPath = path + "/TensileLibrary_" + processor + ".dat";
++		tensileLibraryPath = path + "/TensileLibrary_" + processor + ".dat";
+ #endif
+-                if(!TestPath(tensileLibraryPath))
+-                {
++		if(TestPath(tensileLibraryPath))
++		  break;
++
+ #ifdef TENSILE_YAML
+-                    tensileLibraryPath = path + "/TensileLibrary.yaml";
++		tensileLibraryPath = path + "/TensileLibrary.yaml";
+ #else
+-                    tensileLibraryPath = path + "/TensileLibrary.dat";
++		tensileLibraryPath = path + "/TensileLibrary.dat";
+ #endif
+-                    if(!TestPath(tensileLibraryPath))
+-                    {
++		if(TestPath(tensileLibraryPath))
++		  break;
++
+ #if ROCBLAS_TENSILE_SEPARATE_ARCH
+-                        rocblas_cerr << "\nrocBLAS error: Cannot read " << tensileLibraryPath
+-                                     << ": " << strerror(errno) << " for GPU arch : " << processor
+-                                     << std::endl;
++		rocblas_cerr << "\nrocBLAS error: Cannot read " << tensileLibraryPath
++			     << ": " << strerror(errno) << " for GPU arch : " << processor
++			     << std::endl;
+ #if ROCBLAS_TENSILE_LAZY_LOAD
+-                        std::regex fileMatcher(path + "/TensileLibrary_lazy.*");
++		std::regex fileMatcher(path + "/TensileLibrary_lazy.*");
+ #else
+-                        std::regex fileMatcher(path + "/TensileLibrary_gfx\\d+.dat");
++		std::regex fileMatcher(path + "/TensileLibrary_gfx\\d+.dat");
+ #endif
+-                        rocblas_cerr << " List of available TensileLibrary Files : " << std::endl;
+-                        for(auto& file_name : fs::directory_iterator(path))
+-                        {
+-                            if(std::regex_match(file_name.path().string(), fileMatcher))
+-                            {
+-                                rocblas_cerr << file_name << std::endl;
+-                            }
+-                        }
++		rocblas_cerr << " List of available TensileLibrary Files : " << std::endl;
++		for(auto& file_name : fs::directory_iterator(path))
++		  {
++		    if(std::regex_match(file_name.path().string(), fileMatcher))
++		      {
++			rocblas_cerr << file_name << std::endl;
++		      }
++		  }
+ #else
+-                        rocblas_cerr << "\nrocBLAS error: Cannot read " << tensileLibraryPath
+-                                     << ": " << strerror(errno) << std::endl;
++		rocblas_cerr << "\nrocBLAS error: Cannot read " << tensileLibraryPath
++			     << ": " << strerror(errno) << std::endl;
+ #endif
+-                        rocblas_abort();
+-                    }
+-                }
+-            }
+-            else
+-                tensile_lazy_load_enabled = true;
++		if (i == 1)
++		  rocblas_abort();
++	    }
+ 
+             //Supports multi architecture configuration in lazy library loading mode
+             static int initialize_once = [&] {
+-- 
+2.53.0
+
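
Note on patch 6/6: the search order introduced by the loop (specific processor first, then generic; lazy library, then per-arch library, then the plain library) can be sketched as follows. This is an illustration under assumptions, not the rocBLAS code: `find_tensile_library` and the `exists` callback are hypothetical, the directory `/lib/rocblas/library` is made up for the example, and the sketch returns `None` where the C++ code prints diagnostics and calls `rocblas_abort()`.

```python
from typing import Callable, Optional, Sequence, Tuple


def find_tensile_library(
    base_path: str,
    processors: Sequence[str],
    exists: Callable[[str], bool],
    ext: str = "dat",
) -> Optional[Tuple[str, bool]]:
    """Return (library_path, lazy_load_enabled) for the first match, else None."""
    for processor in processors:
        path = base_path
        # Prefer a per-processor subdirectory when one is present.
        if exists(path + "/" + processor):
            path += "/" + processor
        candidates = [
            (f"{path}/TensileLibrary_lazy_{processor}.{ext}", True),
            (f"{path}/TensileLibrary_{processor}.{ext}", False),
            (f"{path}/TensileLibrary.{ext}", False),
        ]
        for candidate, lazy in candidates:
            if exists(candidate):
                return candidate, lazy
    return None


# Simulated file system: only the generic lazy library is installed.
files = {"/lib/rocblas/library/TensileLibrary_lazy_gfx11-generic.dat"}
result = find_tensile_library(
    "/lib/rocblas/library", ["gfx1100", "gfx11-generic"], files.__contains__
)
print(result)
```

Passing `exists` as a callback keeps the sketch testable without touching the real file system.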

diff --git a/rocblas.spec b/rocblas.spec
index af1da25..5cdc99c 100644
--- a/rocblas.spec
+++ b/rocblas.spec
@@ -19,16 +19,19 @@
 # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 # THE SOFTWARE.
 #
-%bcond_with gitcommit
-%if %{with gitcommit}
-%global commit0 de5c1aebb641af098d9310a9fcca5591a7c066c8
-%global shortcommit0 %(c=%{commit0}; echo ${c:0:7})
-%global date0 20251015
-%endif
-
 %global upstreamname rocblas
+
+%bcond_with preview
+%if %{with preview}
+%global rocm_release 7.11
+%global rocm_patch 0
+%global pkg_src therock-%{rocm_release}
+%else
 %global rocm_release 7.2
 %global rocm_patch 0
+%global pkg_src rocm-%{rocm_release}.%{rocm_patch}
+%endif
+
 %global rocm_version %{rocm_release}.%{rocm_patch}
 
 %bcond_with compat
@@ -158,27 +161,32 @@ Name:           rocblas%{pkg_suffix}
 Summary:        BLAS implementation for ROCm
 License:        MIT AND BSD-3-Clause AND 0BSD
 URL:            https://github.com/ROCm/rocm-libraries
-
-%if %{with gitcommit}
-Version:        git%{date0}.%{shortcommit0}
-Release:        3%{?dist}
-Source0:        %{url}/archive/%{commit0}/rocm-libraries-%{shortcommit0}.tar.gz
-%else
 Version:        %{rocm_version}
-Release:        2%{?dist}
-Source0:        %{url}/releases/download/rocm-%{version}/%{upstreamname}.tar.gz#/%{upstreamname}-%{version}.tar.gz
+%if %{with preview}
+Release:        0%{?dist}
+%else
+Release:        3%{?dist}
 %endif
 
-Patch1:         0001-fixup-install-of-tensile-output.patch
+Source0:        %{url}/releases/download/%{pkg_src}/%{upstreamname}.tar.gz#/%{upstreamname}-%{version}.tar.gz
+Source1:        %{url}/releases/download/%{pkg_src}/tensile.tar.gz#/tensile-%{version}.tar.gz
 
-# Bundled tensile
-Source1:        https://github.com/ROCmSoftwarePlatform/Tensile/archive/rocm-%{version}.tar.gz#/Tensile-%{version}.tar.gz
+%if %{with preview}
+Patch1:         0001-improve-the-warning-for-asm-caps-mismatches.patch
+Patch2:         0002-add-generic-gpu-targets.patch
+Patch3:         0003-improve-fallback-name-to-handle-generics.patch
+Patch4:         0004-generic-arches-need-a-solution-index.patch
+Patch5:         0005-rocblas-add-rocblas_internal_get_generic_arch_name.patch
+Patch6:         0006-rocblas-generalize-finding-tensile-for-generics.patch
+%else
+Patch1:         0001-fixup-install-of-tensile-output.patch
 Patch101:       0001-tensile-fedora-gpus.patch
 Patch102:       0001-tensile-gfx1153.patch
 Patch103:       0001-tensile-set-default-paths.patch
 Patch104:       0001-tensile-ignore-cache-check.patch
 Patch105:       0001-tensile-add-cmake-arches.patch
 Patch106:       0001-tensile-gfx1036.patch
+%endif
 
 BuildRequires:  cmake
 BuildRequires:  gcc-c++
@@ -336,24 +344,29 @@ Requires:       diffutils
 %endif
 
 %prep
-%if %{with gitcommit}
-%setup -q -n rocm-libraries-%{commit0}
-cd projects/rocblas
-%patch -P1 -p1 
-%else
 %setup -q -n %{upstreamname}
+%if %{with preview}
+%patch -P5 -p3
+%patch -P6 -p3
+%else
 %patch -P1 -p1
 %endif
 
 tar xf %{SOURCE1}
-mv Tensile-* Tensile
-cd Tensile
+cd tensile
+%if %{with preview}
+%patch -P1 -p3
+%patch -P2 -p3
+%patch -P3 -p3
+%patch -P4 -p3
+%else
 %patch -P101 -p1
 %patch -P102 -p1
 %patch -P103 -p1
 %patch -P104 -p1
 %patch -P105 -p1
 %patch -P106 -p1
+%endif
 
 #Fix a few things:
 chmod 755 Tensile/Configs/miopen/convert_cfg.py
@@ -384,8 +397,13 @@ sed -i -e '/rich/d' requirements.*
 sed -i -e '/msgpack/d' requirements.*
 
 # Generalize prefix
+%if %{with preview}
+sed -i -e 's@DEFAULT_ROCM_BIN_PATH_POSIX = Path("/opt/rocm/bin")@DEFAULT_ROCM_BIN_PATH_POSIX = Path("%{pkg_prefix}/bin")@' Tensile/Utilities/Toolchain.py
+sed -i -e 's@DEFAULT_ROCM_LLVM_BIN_PATH_POSIX = Path("/opt/rocm/lib/llvm/bin")@DEFAULT_ROCM_LLVM_BIN_PATH_POSIX = Path("%{rocmllvm_bindir}")@' Tensile/Utilities/Toolchain.py
+%else
 sed -i -e 's@/usr/bin@%{pkg_prefix}/bin@' Tensile/Utilities/Toolchain.py
 sed -i -e 's@/usr/lib64/rocm/llvm/bin@%{rocmllvm_bindir}@' Tensile/Utilities/Toolchain.py
+%endif
 
 # Make sure hip/hip_runtime.h is found
 sed -i -e 's@"-D__HIP_HCC_COMPAT_MODE__=1"@"-D__HIP_HCC_COMPAT_MODE__=1","-I%{pkg_prefix}/include"@' Tensile/BuildCommands/SourceCommands.py
@@ -415,7 +433,7 @@ sed -i -e 's@list( APPEND COMMON_LINK_LIBS "-lgfortran")@#list( APPEND COMMON_LI
 
 %if %{with tensile}
 %if %{with bundled_tensile}
-cd Tensile
+cd tensile
 TL=$PWD
 python3 setup.py install --root $TL
 TP=${TL}/usr/lib/python%{python3_version}/site-packages/Tensile/
@@ -425,10 +443,6 @@ TP=`/usr/bin/TensileGetPath`
 %endif
 %endif
 
-%if %{with gitcommit}
-cd projects/rocblas
-%endif
-
 CORES=`lscpu | grep 'Core(s)' | awk '{ print $4 }'`
 if [ ${CORES}x = x ]; then
     CORES=1
@@ -452,10 +466,6 @@ export HIPCC_LINK_FLAGS_APPEND=-fuse-ld=lld
 %cmake_build
 
 %install
-%if %{with gitcommit}
-cd projects/rocblas
-%endif
-
 %cmake_install
 
 # Extra license
@@ -478,13 +488,8 @@ export LD_LIBRARY_PATH=%{_vpath_builddir}/library/src:$LD_LIBRARY_PATH
 %endif
 
 %files -n %{rocblas_name}
-%if %{with gitcommit}
-%license projects/rocblas/LICENSE.md
-%doc projects/rocblas/README.md
-%else
 %license LICENSE.md
 %doc README.md
-%endif
 %{pkg_prefix}/%{pkg_libdir}/librocblas.so.5{,.*}
 %if %{with tensile}
 %{pkg_prefix}/%{pkg_libdir}/rocblas/
@@ -501,6 +506,10 @@ export LD_LIBRARY_PATH=%{_vpath_builddir}/library/src:$LD_LIBRARY_PATH
 %endif
 
 %changelog
+* Sat Mar 7 2026 Tom Rix <Tom.Rix@amd.com> - 7.2.0-3
+- Change --with gitcommit to preview
+- Use rocm-libraries for tensile source
+
 * Sun Feb 15 2026 Tom Rix <Tom.Rix@amd.com> - 7.2.0-2
 - strip hsaco files
 - make test optional

diff --git a/sources b/sources
index 6749db9..7c48b91 100644
--- a/sources
+++ b/sources
@@ -1,2 +1,3 @@
 SHA512 (Tensile-7.2.0.tar.gz) = fc1946aa1c3ebddbdab02f6966d7ed08d937e17518d192b31a54d2084972188d8c71b8d1c58f0fd5d8455cc9a3e11414f1f7dbbfd284e0c90538264b9af2c4d0
 SHA512 (rocblas-7.2.0.tar.gz) = 5301a8822c4d3b9ea4223ebe001a80522605d0b2634d11e824043026fe8b148c424c4ffaa4402133dcb28857363c273aa56caa3533b91b0b6147e0289350ca1f
+SHA512 (tensile-7.2.0.tar.gz) = 8b17ee9fc2c0998242928ee923d82f7125d551940af71afc3bcfee90b02e654f9715e84f2caf2dd720e0904e670930b7a9e014b929ebeae04608ba7a128532dd
