[rpms/gdb] gdb-17.2-rebase-f44: fix rhbz2366461 -- missing thread issue and solib entry address issue


From: Andrew Burgess <aburgess@redhat.com>
To: git-commits@fedoraproject.org
Subject: [rpms/gdb] gdb-17.2-rebase-f44: fix rhbz2366461 -- missing thread issue and solib entry address issue
Date: Sun, 28 Jun 2026 00:02:21 GMT
Message-ID: <178260494168.1.5132699532561625272.rpms-gdb-4ed07e90ac8a@fedoraproject.org>

            A new commit has been pushed.

            Repo   : rpms/gdb
            Branch : gdb-17.2-rebase-f44
            Commit : 4ed07e90ac8ac56d4da8c4a5e63764ffa9fbe6e4
            Author : Andrew Burgess <aburgess@redhat.com>
            Date   : 2026-04-23T16:54:13+01:00
            Stats  : +970/-1 in 6 file(s)
            URL    : https://src.fedoraproject.org/rpms/gdb/c/4ed07e90ac8ac56d4da8c4a5e63764ffa9fbe6e4?branch=gdb-17.2-rebase-f44

            Log:
            fix rhbz2366461 -- missing thread issue and solib entry address issue

Backport upstream commits 8bd08ee92c4 and cd289df068e to address
rhbz2366461.  These backports will not be needed once we rebase to GDB
18.

The rhbz2366461 bug unfortunately contains two separate bug reports;
they appear to have been incorrectly merged by libreport (I think).
The two backports in this commit fix the two issues.

---
diff --git a/_gdb.spec.Patch.include b/_gdb.spec.Patch.include
index 56bca73..e306477 100644
--- a/_gdb.spec.Patch.include
+++ b/_gdb.spec.Patch.include
@@ -50,3 +50,11 @@ Patch009: gdb-rhbz2413405-gcore-unreadable-pages.patch
 # path components in OUTDIR.gdb/testsuite: fix FAILs in fileio.exp
 Patch010: gdb-fileio-test-fixes.patch
 
+# Backport upstream commit 8bd08ee92c4 to address rhbz2366461.  This
+# commit will drop out with GDB 18.
+Patch011: gdb-rhbz2366461-missing-thread.patch
+
+# Backport upstream commit cd289df068e to address rhbz2366461.  This
+# commit will drop out with GDB 18.
+Patch012: gdb-rhbz2366461-bad-solib-entry-addr.patch
+

diff --git a/_gdb.spec.patch.include b/_gdb.spec.patch.include
index 6295406..dad4d0a 100644
--- a/_gdb.spec.patch.include
+++ b/_gdb.spec.patch.include
@@ -8,3 +8,5 @@
 %patch -p1 -P008
 %patch -p1 -P009
 %patch -p1 -P010
+%patch -p1 -P011
+%patch -p1 -P012

diff --git a/_patch_order b/_patch_order
index 6fe91da..ca215be 100644
--- a/_patch_order
+++ b/_patch_order
@@ -8,3 +8,5 @@ gdb-rhbz2403580-misplaced-symtabs.patch
 gdb-rhbz2435950-skip-revert.patch
 gdb-rhbz2413405-gcore-unreadable-pages.patch
 gdb-fileio-test-fixes.patch
+gdb-rhbz2366461-missing-thread.patch
+gdb-rhbz2366461-bad-solib-entry-addr.patch

diff --git a/gdb-rhbz2366461-bad-solib-entry-addr.patch b/gdb-rhbz2366461-bad-solib-entry-addr.patch
new file mode 100644
index 0000000..836ab5f
--- /dev/null
+++ b/gdb-rhbz2366461-bad-solib-entry-addr.patch
@@ -0,0 +1,453 @@
+From FEDORA_PATCHES Mon Sep 17 00:00:00 2001
+From: Andrew Burgess <aburgess@redhat.com>
+Date: Wed, 15 Apr 2026 10:43:31 +0100
+Subject: gdb-rhbz2366461-bad-solib-entry-addr.patch
+
+;; Backport upstream commit cd289df068e to address rhbz2366461.  This
+;; commit will drop out with GDB 18.
+
+gdb: don't use .text as default entry point section
+
+We got a Fedora GDB bug report that a user tried to debug an AppImage,
+and GDB would reliably crash like this:
+
+  (gdb) run
+  Starting program: /tmp/build/gdb/testsuite/outputs/gdb.base/solib-bad-entry-addr/solib-bad-entry-addr
+  ../../src/gdb/symfile.c:843: internal-error: sect_index_text not initialized
+  A problem internal to GDB has been detected,
+  further debugging may prove unreliable.
+  ----- Backtrace -----
+  ... etc ...
+
+The specific AppImage being debugged can be found here; I've modified
+the URL with a warning marker.  I make no claims about whether it is
+safe to download the image, and running it is definitely at your own
+risk.  If you wish to, delete the warning marker to download:
+
+  https://github.com/Murmele/Gittyup/<RUN AT YOUR OWN RISK>/releases/download/gittyup_v1.4.0/Gittyup-1.4.0-x86_64.AppImage
+
+At the point of the above crash GDB's stack is:
+
+  #9  0x000000000190c6ed in internal_error_loc (file=0x1e4a94e "../../src/gdb/symfile.c", line=838, fmt=0x1e4aa50 "sect_index_text not initialized") at ../../src/gdbsupport/errors.cc:57
+  #10 0x0000000000f5f5ac in init_entry_point_info (objfile=0x5a98e80) at ../../src/gdb/symfile.c:838
+  #11 0x0000000000f5f943 in syms_from_objfile (objfile=0x5a98e80, addrs=0x7ffd78728490, add_flags=...) at ../../src/gdb/symfile.c:962
+  #12 0x0000000000f5fe6d in symbol_file_add_with_addrs (abfd=..., name=0x3e76e50 "/tmp/.mount_GittyujmIBkD/usr/bin/../../home/runner/work/Gittyup/Qt/5.15.2/gcc_64/lib/./libicudata.so.56", add_flags=..., addrs=0x7ffd78728490, flags=..., parent=0x0) at ../../src/gdb/symfile.c:1071
+  #13 0x0000000000f601aa in symbol_file_add_from_bfd (abfd=..., name=0x3e76e50 "/tmp/.mount_GittyujmIBkD/usr/bin/../../home/runner/work/Gittyup/Qt/5.15.2/gcc_64/lib/./libicudata.so.56", add_flags=..., addrs=0x7ffd78728490, flags=..., parent=0x0) at ../../src/gdb/symfile.c:1145
+  #14 0x0000000000f0f2ad in solib_read_symbols (so=..., flags=...) at ../../src/gdb/solib.c:627
+  #15 0x0000000000f10263 in solib_add (pattern=0x0, from_tty=0, readsyms=1) at ../../src/gdb/solib.c:960
+
+From this we can see GDB is trying to add the shared library:
+
+  /tmp/.mount_GittyujmIBkD/usr/bin/../../home/runner/work/Gittyup/Qt/5.15.2/gcc_64/lib/./libicudata.so.56
+
+The internal error is triggered from these lines in
+init_entry_point_info:
+
+  if (!found)
+    ei->the_bfd_section_index = SECT_OFF_TEXT (objfile);
+
+Where SECT_OFF_TEXT is:
+
+  #define SECT_OFF_TEXT(objfile) \
+     ((objfile->sect_index_text == -1) \
+      ? (internal_error (_("sect_index_text not initialized")), -1)	\
+      : objfile->sect_index_text)
+
+So we can see that objfile::sect_index_text is -1, which leads to the
+internal error.
+
+Looking at the 'readelf -Wa ...' output for the shared library in
+question we see this:
+
+  ELF Header:
+    Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
+    Class:                             ELF64
+    Data:                              2's complement, little endian
+    Version:                           1 (current)
+    OS/ABI:                            UNIX - System V
+    ABI Version:                       0
+    Type:                              DYN (Shared object file)
+    Machine:                           Advanced Micro Devices X86-64
+    Version:                           0x1
+    Entry point address:               0x2d7
+    Start of program headers:          25051136 (bytes into file)
+    Start of section headers:          25047552 (bytes into file)
+    Flags:                             0x0
+    Size of this header:               64 (bytes)
+    Size of program headers:           56 (bytes)
+    Number of program headers:         7
+    Size of section headers:           64 (bytes)
+    Number of section headers:         11
+    Section header string table index: 7
+
+  Section Headers:
+    [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
+    [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
+    [ 1] .note.gnu.build-id NOTE            0000000000000190 000190 000024 00   A  0   0  4
+    [ 2] .gnu.hash         GNU_HASH        00000000000001b8 0001b8 000034 00   A  3   0  8
+    [ 3] .dynsym           DYNSYM          00000000000001f0 0001f0 000090 18   A 10   2  8
+    [ 4] .rodata           PROGBITS        00000000000002e0 0002e0 17e27d0 00   A  0   0 16
+    [ 5] .eh_frame         PROGBITS        00000000017e2ab0 17e2ab0 000000 00   A  0   0  8
+    [ 6] .dynamic          DYNAMIC         00000000019e2f10 17e2f10 0000f0 10  WA 10   0  8
+    [ 7] .shstrtab         STRTAB          0000000000000000 17e3000 000063 00      0   0  1
+    [ 8] .symtab           SYMTAB          0000000000000000 17e3068 000150 18      9  10  8
+    [ 9] .strtab           STRTAB          0000000000000000 17e31b8 000044 00      0   0  1
+    [10] .dynstr           STRTAB          00000000019e3188 17e4188 000420 00   A  0   0  8
+  Key to Flags:
+    W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
+    L (link order), O (extra OS processing required), G (group), T (TLS),
+    C (compressed), x (unknown), o (OS specific), E (exclude),
+    D (mbind), l (large), p (processor specific)
+
+  There are no section groups in this file.
+
+  Program Headers:
+    Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
+    LOAD           0x000000 0x0000000000000000 0x0000000000000000 0x17e2ab0 0x17e2ab0 R   0x200000
+    GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0x10
+    NOTE           0x000190 0x0000000000000190 0x0000000000000190 0x000024 0x000024 R   0x4
+    LOAD           0x17e2f10 0x00000000019e2f10 0x00000000019e2f10 0x0000f0 0x0000f0 RW  0x200000
+    DYNAMIC        0x17e2f10 0x00000000019e2f10 0x00000000019e2f10 0x0000f0 0x0000f0 RW  0x8
+    GNU_RELRO      0x17e2f10 0x00000000019e2f10 0x00000000019e2f10 0x0000f0 0x0000f0 R   0x1
+    LOAD           0x17e4000 0x00000000019e3000 0x00000000019e3000 0x0005a8 0x0005a8 RW  0x1000
+
+Things to note here are:
+
+  1. There really is no .text section, or any executable sections,
+
+  2. there are 3 LOAD segments.  This will be important later, and
+
+  3. the "Entry point address" is outside all sections, and is
+     non-zero.
+
+Next we can investigate where objfile::sect_index_text is set to
+something other than -1.  Starting in init_objfile_sect_indices, if
+the objfile has a ".text" section then sect_index_text can be set.
+This case clearly doesn't apply.
+
+Next symfile_find_segment_sections is called.  This tries to match a
+common case where we have either 1 or 2 LOAD segments, and assumes a
+default distribution of sections to segments.  However, we have 3 LOAD
+segments, so these lines:
+
+  if (data->segments.size () != 1 && data->segments.size () != 2)
+    return;
+
+result in an early return from symfile_find_segment_sections without
+sect_index_text being set.
+
+Back in init_objfile_sect_indices, if no sections have an offset then
+we set any currently unset sect_index_* values, including
+sect_index_text, to point at section 0.  However, in our case the
+objfile is a relocatable shared library, so the sections will have an
+offset, and so this final fallback case doesn't apply.
+
+The result is that init_objfile_sect_indices never sets
+sect_index_text.  This worries me a little as
+init_objfile_sect_indices contains this comment:
+
+  /* This is where things get really weird...  We MUST have valid
+     indices for the various sect_index_* members or gdb will abort.
+     So if for example, there is no ".text" section, we have to
+     accommodate that.  First, check for a file with the standard
+     one or two segments.  */
+
+Notice the emphasis on MUST in that comment, and indeed, we exit this
+function without setting sect_index_text, and GDB does indeed abort.
+The comment seems to imply that the following code is going to try to
+figure out a suitable stand-in sect_index_text for when there is no
+".text" section, but clearly I've run into a case that isn't covered.
+
+All of this code relating to setting sect_index_text was introduced in
+commit:
+
+  commit 31d99776c73d6fca13163da59c852b0fa99f89b8
+  Date:   Mon Jun 18 15:46:38 2007 +0000
+
+Which unfortunately is from a time when we didn't write useful commit
+messages, so to understand the commit you need to go read the mailing
+list archive, but it doesn't offer much more insight:
+
+  https://sourceware.org/pipermail/gdb-patches/2007-May/050527.html
+
+Clearly the comment in init_objfile_sect_indices would suggest that
+the fix here is to figure out some "fake" value for sect_index_text,
+and that would certainly avoid the problem here.  But, at least for
+this problem, I think there's maybe a better solution.
+
+The original internal error is triggered by a use of SECT_OFF_TEXT in
+init_entry_point_info.  We have an entry point address, we try to find
+the section index for the section containing the entry point, and
+failing that, we assume the entry point is in the text section.  This
+fall-back assumption means that, if the text section has an offset
+applied, then the entry point will also have that same offset
+applied.  But it's not clear to me why picking the text section is
+going to be any more valid than any other section, especially in a
+case like this where we don't even have a text section, so the
+sect_index_text might itself point to some other arbitrary section.
+
+Earlier in init_entry_point_info we already have a fall-back case
+where we set entry_info::entry_point_p to false to indicate that the
+objfile has no entry point, so this is always a possibility.  So I
+wondered about writing something like:
+
+  if (!found)
+    {
+      if (objfile->sect_index_text != -1)
+	ei->the_bfd_section_index = SECT_OFF_TEXT (objfile);
+      else
+        ei->entry_point_p = false;
+    }
+
+If we have no text section index then we just claim the objfile has no
+entry point.  But I didn't like this for two reasons.  First, the
+comment back in init_objfile_sect_indices says that the index should
+be set, which seems to indicate that we should not be making decisions
+later within GDB based on whether the index is set or not.
+
+And second, using the text section as a fall back, when the entry
+address is outside every section, just seems off.  So I wondered, why
+not just reject the entry point completely in this case?  Which is how
+I ended up with:
+
+  if (!found)
+    ei->entry_point_p = false;
+
+With this patch in place I was able to start debugging the AppImage
+linked above.
+
+I created a simple test case which reproduces this issue.  It's a
+little contrived because it has to hit all the points required to
+trigger this bug:
+
+  1. No .text section,
+
+  2. more than 2 LOAD segments, and
+
+  3. entry address outside every section.
+
+I have no idea what caused the original shared library to take on
+these characteristics; it might even be a tool issue building the
+original shared library.  I haven't investigated this, as I don't
+think it really matters: GDB shouldn't be crashing just because the
+incoming objects are a little weird.
+
+I've attached a link to the Fedora bug in the 'Bug:' tag, but it's a
+little confusing.  An automated system has merged together two bug
+reports.  As such the overall bug report linked to is for a
+completely different issue; only comments 21, 22, 23, and 24 relate to
+the bug being fixed here.
+
+Bug: https://bugzilla.redhat.com/show_bug.cgi?id=2366461#c21
+
+Approved-By: Tom Tromey <tom@tromey.com>
+
+diff --git a/gdb/symfile.c b/gdb/symfile.c
+--- a/gdb/symfile.c
++++ b/gdb/symfile.c
+@@ -822,7 +822,6 @@ init_entry_point_info (struct objfile *objfile)
+   if (ei->entry_point_p)
+     {
+       CORE_ADDR entry_point =  ei->entry_point;
+-      int found;
+ 
+       /* Make certain that the address points at real code, and not a
+ 	 function descriptor.  */
+@@ -834,7 +833,7 @@ init_entry_point_info (struct objfile *objfile)
+       ei->entry_point
+ 	= gdbarch_addr_bits_remove (objfile->arch (), entry_point);
+ 
+-      found = 0;
++      bool found = false;
+       for (obj_section &osect : objfile->sections ())
+ 	{
+ 	  struct bfd_section *sect = osect.the_bfd_section;
+@@ -845,13 +844,17 @@ init_entry_point_info (struct objfile *objfile)
+ 	    {
+ 	      ei->the_bfd_section_index
+ 		= gdb_bfd_section_index (objfile->obfd.get (), sect);
+-	      found = 1;
++	      found = true;
+ 	      break;
+ 	    }
+ 	}
+ 
++      /* We store the section index so that the entry address can be
++	 relocated when used.  If the entry address is outside of any
++	 section then we cannot relocate it.  Just claim that there is no
++	 entry address in this case.  */
+       if (!found)
+-	ei->the_bfd_section_index = SECT_OFF_TEXT (objfile);
++	ei->entry_point_p = false;
+     }
+ }
+ 
+diff --git a/gdb/testsuite/gdb.base/solib-bad-entry-addr-lib.s b/gdb/testsuite/gdb.base/solib-bad-entry-addr-lib.s
+new file mode 100644
+--- /dev/null
++++ b/gdb/testsuite/gdb.base/solib-bad-entry-addr-lib.s
+@@ -0,0 +1,33 @@
++/* Copyright 2026 Free Software Foundation, Inc.
++
++   This program is free software; you can redistribute it and/or modify
++   it under the terms of the GNU General Public License as published by
++   the Free Software Foundation; either version 3 of the License, or
++   (at your option) any later version.
++
++   This program is distributed in the hope that it will be useful,
++   but WITHOUT ANY WARRANTY; without even the implied warranty of
++   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
++   GNU General Public License for more details.
++
++   You should have received a copy of the GNU General Public License
++   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
++
++	.section .rodata
++	.globl lib_var
++	.type lib_var, @object
++lib_var:
++	.long 42
++	.size lib_var, .-lib_var
++
++	/* This executable section exists solely to force the linker
++	   to create an additional LOAD segment (with R+X permissions),
++	   giving the library 3+ LOAD segments instead of the default 2.
++
++	   This matters because GDB's symfile_find_segment_sections only
++	   runs for objects with exactly 1 or 2 segments.  When it runs,
++	   it sets sect_index_text from the first section in segment 1,
++	   which masks the bug we are testing for, see the .exp file for
++	   details.  */
++	.section .not_text, "ax", @progbits
++	.byte 0
+diff --git a/gdb/testsuite/gdb.base/solib-bad-entry-addr.c b/gdb/testsuite/gdb.base/solib-bad-entry-addr.c
+new file mode 100644
+--- /dev/null
++++ b/gdb/testsuite/gdb.base/solib-bad-entry-addr.c
+@@ -0,0 +1,22 @@
++/* Copyright 2026 Free Software Foundation, Inc.
++
++   This program is free software; you can redistribute it and/or modify
++   it under the terms of the GNU General Public License as published by
++   the Free Software Foundation; either version 3 of the License, or
++   (at your option) any later version.
++
++   This program is distributed in the hope that it will be useful,
++   but WITHOUT ANY WARRANTY; without even the implied warranty of
++   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
++   GNU General Public License for more details.
++
++   You should have received a copy of the GNU General Public License
++   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
++
++extern int lib_var;
++
++int
++main ()
++{
++  return lib_var;
++}
+diff --git a/gdb/testsuite/gdb.base/solib-bad-entry-addr.exp b/gdb/testsuite/gdb.base/solib-bad-entry-addr.exp
+new file mode 100644
+--- /dev/null
++++ b/gdb/testsuite/gdb.base/solib-bad-entry-addr.exp
+@@ -0,0 +1,99 @@
++# Copyright 2026 Free Software Foundation, Inc.
++
++# This program is free software; you can redistribute it and/or modify
++# it under the terms of the GNU General Public License as published by
++# the Free Software Foundation; either version 3 of the License, or
++# (at your option) any later version.
++#
++# This program is distributed in the hope that it will be useful,
++# but WITHOUT ANY WARRANTY; without even the implied warranty of
++# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
++# GNU General Public License for more details.
++#
++# You should have received a copy of the GNU General Public License
++# along with this program.  If not, see <http://www.gnu.org/licenses/>.
++
++# This test aims to exercise a specific situation which was seen
++# causing GDB to crash.  An application has a shared library that
++# meets the following conditions:
++#
++# 1. No ".text" section,
++# 2. at least 3 LOAD segments, and
++# 3. a non-zero entry address that is outside of every section.
++#
++# When these 3 conditions are met then GDB would run into a problem in
++# init_entry_point_info for the shared library.  The non-zero entry
++# address means GDB would try to find the section corresponding to the
++# address.  As the address is outside of all sections then GDB would
++# try to use the .text section as a fall-back.  But there is no text
++# section, so an internal error would be triggered.
++#
++# The 3 LOAD segments is important because, if there are only 1 or 2
++# segments GDB has a default in symfile_find_segment_sections where
++# it assumes the text section is the first section in the first
++# segment, which means init_entry_point_info will have a .text section
++# to use.
++#
++# This test has a very simple assembler file (which hopefully contains
++# no architecture specific content), which will compile to a couple of
++# sections, but no .text section.  These sections force the creation
++# of more than 2 LOAD segments.  The compiler flags then set the entry
++# address to 0x1, which (we hope) is outside all sections.  This
++# assembler file is compiled into a shared library.
++
++require allow_shlib_tests
++require !use_gdb_stub
++require {istarget *-linux*}
++
++standard_testfile .c -lib.s
++
++# Build shared library.
++set lib_testfile ${testfile}-lib.so
++set lib_srcfile ${srcfile2}
++set lib_binfile [standard_output_file ${lib_testfile}]
++set lib_flags {shlib \
++		   additional_flags=-nostartfiles \
++		   additional_flags=-Wl,-e,0x1 \
++		   additional_flags=-Wl,-z,separate-code}
++if { [build_executable "build solib" $lib_testfile $lib_srcfile \
++	  $lib_flags] == -1 } {
++    return
++}
++
++# Build the test executable.
++if { [build_executable "build exec" $testfile $srcfile \
++	  [list debug pie shlib=${lib_binfile}]] == -1 } {
++    return
++}
++
++# Confirm we have more than 2 LOAD segments in the shared library.
++set readelf_program [gdb_find_readelf]
++set command "exec $readelf_program -Wl $lib_binfile"
++verbose -log "command is $command"
++set result [catch {{*}$command} output]
++verbose -log "result is $result"
++verbose -log "output is $output"
++if {$result != 0} {
++    fail "read program headers from $lib_testfile"
++    return
++}
++if {![regexp {\nProgram Headers:\n *Type [^\n]* Align\n(.*?)\n\n} $output trash phdr]} {
++    fail "no Program Headers found"
++    return
++}
++set load_segment_count 0
++foreach line [regexp -line -all -inline {^ *LOAD .* 0x[0-9]+$} $phdr] {
++    incr load_segment_count
++}
++if { $load_segment_count <= 2 } {
++    fail "not enough LOAD segments"
++    return
++}
++
++# Start GDB and run to main.  If the shared library is causing issues
++# then we will see an internal error once the inferior starts running.
++clean_restart $testfile
++
++if {![runto_main message]} {
++  return 0
++}

diff --git a/gdb-rhbz2366461-missing-thread.patch b/gdb-rhbz2366461-missing-thread.patch
new file mode 100644
index 0000000..9dacf97
--- /dev/null
+++ b/gdb-rhbz2366461-missing-thread.patch
@@ -0,0 +1,499 @@
+From FEDORA_PATCHES Mon Sep 17 00:00:00 2001
+From: Andrew Burgess <aburgess@redhat.com>
+Date: Fri, 16 May 2025 17:56:58 +0100
+Subject: gdb-rhbz2366461-missing-thread.patch
+
+;; Backport upstream commit 8bd08ee92c4 to address rhbz2366461.  This
+;; commit will drop out with GDB 18.
+
+gdb: crash if thread unexpectedly disappears from thread list
+
+A bug was reported to Red Hat where GDB was crashing with an assertion
+failure, the assertion message is:
+
+  ../../gdb/regcache.c:432: internal-error: get_thread_regcache: Assertion `thread->state != THREAD_EXITED' failed.
+
+The backtrace for the crash is:
+
+  #5  0x000055a21da8a880 in internal_vproblem(internal_problem *, const char *, int, const char *, typedef __va_list_tag __va_list_tag *) (problem=problem@entry=0x55a21e289060 <internal_error_problem>, file=<optimized out>, line=<optimized out>, fmt=<optimized out>, ap=ap@entry=0x7ffec7576be0) at ../../gdb/utils.c:477
+  #6  0x000055a21da8aadf in internal_verror (file=<optimized out>, line=<optimized out>, fmt=<optimized out>, ap=ap@entry=0x7ffec7576be0) at ../../gdb/utils.c:503
+  #7  0x000055a21dcbd055 in internal_error_loc (file=file@entry=0x55a21dd33b71 "../../gdb/regcache.c", line=line@entry=432, fmt=<optimized out>) at ../../gdbsupport/errors.cc:57
+  #8  0x000055a21d8baaa9 in get_thread_regcache (thread=thread@entry=0x55a258de3a50) at ../../gdb/regcache.c:432
+  #9  0x000055a21d74fa18 in print_signal_received_reason (uiout=0x55a258b649b0, siggnal=GDB_SIGNAL_TRAP) at ../../gdb/infrun.c:9287
+  #10 0x000055a21d7daad9 in mi_interp::on_signal_received (this=0x55a258af5f60, siggnal=GDB_SIGNAL_TRAP) at ../../gdb/mi/mi-interp.c:372
+  #11 0x000055a21d76ef99 in interps_notify<void (interp::*)(gdb_signal), gdb_signal&> (method=&virtual table offset 88, this adjustment 974682) at ../../gdb/interps.c:369
+  #12 0x000055a21d76e58f in interps_notify_signal_received (sig=<optimized out>, sig@entry=GDB_SIGNAL_TRAP) at ../../gdb/interps.c:378
+  #13 0x000055a21d75074d in notify_signal_received (sig=GDB_SIGNAL_TRAP) at ../../gdb/infrun.c:6818
+  #14 0x000055a21d755af0 in normal_stop () at ../../gdb/gdbthread.h:432
+  #15 0x000055a21d768331 in fetch_inferior_event () at ../../gdb/infrun.c:4753
+
+The user is using a build of GDB with 32-bit ARM support included, and
+they gave the following description for what they were doing at the
+time of the crash:
+
+  Suspended the execution of the firmware in Eclipse.  The gdb was
+  connected to JLinkGDBServer with activated FreeRTOS awareness JLink
+  plugin.
+
+So they are remote debugging with a non-gdbserver target.
+
+Looking in normal_stop() we see this code:
+
+  /* As we're presenting a stop, and potentially removing breakpoints,
+     update the thread list so we can tell whether there are threads
+     running on the target.  With target remote, for example, we can
+     only learn about new threads when we explicitly update the thread
+     list.  Do this before notifying the interpreters about signal
+     stops, end of stepping ranges, etc., so that the "new thread"
+     output is emitted before e.g., "Program received signal FOO",
+     instead of after.  */
+  update_thread_list ();
+
+  if (last.kind () == TARGET_WAITKIND_STOPPED && stopped_by_random_signal)
+    notify_signal_received (inferior_thread ()->stop_signal ());
+
+Which accounts for the transition from frame #14 to frame #13.  But it
+is the update_thread_list() call which interests me.  This call asks
+the target (remote target in this case) for the current thread list,
+and then marks threads exited based on the answer.
+
+And so, if a (badly behaved) target (incorrectly) removes a thread
+from the thread list, then the update_thread_list() call will mark the
+impacted thread as exited, even if GDB is currently handling a signal
+stop event for that target.
+
+My guess for what's going on here then is this:
+
+  1. Thread receives a signal.
+  2. Remote target sends GDB a stop with signal packet.
+  3. Remote decides that the thread is going away soon, and marks the
+     thread as exited.
+  4. GDB asks for the thread list.
+  5. Remote sends back the thread list, which doesn't include the
+     event thread, as the remote thinks this thread has exited.
+  6. GDB marks the thread as exited, and then proceeds to try and
+     print the signal stop event for the event thread.
+  7. Printing the signal stop requires reading registers, which
+     requires a regcache.  We can only get a regcache for a non-exited
+     thread, and so GDB raises an assertion.
+
+Using the gdbreplay test framework I was able to reproduce this
+failure using gdbserver.  I create an inferior with two threads, the
+main thread sends a signal to the second thread, GDB sees the signal
+arrive and prints this information for the user.
+
+Having captured the trace of this activity, I then find the thread
+list reply in the log file, and modify it to remove the second thread.
+
+Now, when I replay the modified log file I see the same assertion
+complaining about an attempt to get a regcache for an exited thread.
+
+I'm not entirely sure the best way to fix this.  Clearly the problem
+here is a bad remote target.  But, replies from a remote target
+should (in my opinion) not be considered trusted, as a consequence, we
+should not be asserting based on data coming from a remote.  Instead,
+we should be giving warnings or errors and have GDB handle the bad
+data as best it can.
+
+This is the second attempt to fix this issue, my first patch can be
+seen here:
+
+  https://inbox.sourceware.org/gdb-patches/062e438c8677e2ab28fac6183d2ea6d444cb9121.1747567717.git.aburgess@redhat.com
+
+In the first patch I was checking in normal_stop, immediately after
+the call to update_thread_list, to see if the current thread was now
+marked as exited.  However CI testing showed an issue with this
+approach; I was already checking for many different TARGET_WAITKIND_*
+kinds where the "is the current thread exited" question didn't make
+sense, and it turns out that the list of kinds in my first attempt was
+already insufficient.
+
+Rather than just adding to the list, in this revised patch
+I'm proposing to move the "is this thread exited" check inside the
+block which handles signal stop events.
+
+Right now, the only part of normal_stop which I know relies on the
+current thread not being exited is the call to notify_signal_received,
+so before calling notify_signal_received I check to see if the current
+thread is now exited.  If it is then I print a warning to indicate
+that the thread has unexpectedly exited and that the current
+command (continue/step/etc) has been cancelled, I then change the
+current event type to TARGET_WAITKIND_SPURIOUS.
+
+GDB's output now looks like this in all-stop mode:
+
+  (gdb) continue
+  Continuing.
+  [New Thread 3483690.3483693]
+  [Thread 3483690.3483693 exited]
+  warning: Thread 3483690.3483693 unexpectedly exited after non-exit event
+  [Switching to Thread 3483690.3483693]
+  (gdb)
+
+The non-stop output is identical, except we don't switch thread (stop
+events never trigger a thread switch in non-stop mode).
+
+The included test makes use of the gdbreplay framework, and tests in
+all-stop and non-stop modes.  I would like to do more extensive
+testing of GDB's state after receiving the unexpected thread list,
+but due to using gdbreplay for testing, this is quite hard.  Many
+commands, especially those looking at thread state, are likely to
+trigger additional packets being sent to the remote, which causes
+gdbreplay to bail out as the new packet doesn't match the original
+recorded state.  However, I really don't think it is a good idea to
+change gdbserver in order to "fake" this error case, so for now, using
+gdbreplay is the best idea I have.
+
+Bug: https://bugzilla.redhat.com/show_bug.cgi?id=2366461
+
+diff --git a/gdb/infrun.c b/gdb/infrun.c
+--- a/gdb/infrun.c
++++ b/gdb/infrun.c
+@@ -9562,7 +9562,38 @@ normal_stop ()
+   update_thread_list ();
+ 
+   if (last.kind () == TARGET_WAITKIND_STOPPED && stopped_by_random_signal)
+-    notify_signal_received (inferior_thread ()->stop_signal ());
++    {
++      gdb_assert (inferior_ptid != null_ptid);
++
++      /* Calling update_thread_list pulls information from the target.  For
++	 native targets we can be (reasonably) sure that the information we
++	 get back is sane, but for remote targets, we cannot rely on the
++	 returned thread list to be correct.
++
++	 Specifically, a remote target (not gdbserver), has been seen to
++	 prematurely remove threads from the thread list after sending a
++	 signal stop event.  The consequence of this is that the thread
++	 might now be exited.  This is bad as calling
++	 notify_signal_received will cause GDB to read registers for the
++	 current thread, but requesting the regcache for an exited thread
++	 will trigger an assertion.
++
++	 Check for the exited thread case here, and convert the stop reason
++	 to a spurious stop event.  The thread exiting will have already
++	 been reported (when the thread list was parsed), so making this a
++	 spurious stop will cause GDB to drop back to the prompt.  */
++      if (inferior_thread ()->state != THREAD_EXITED)
++	notify_signal_received (inferior_thread ()->stop_signal ());
++      else
++	{
++	  warning (_("command aborted, %s unexpectedly exited after signal stop event"),
++		   target_pid_to_str (inferior_thread ()->ptid).c_str ());
++
++	  /* Mark this as a spurious stop.  GDB will return to the
++	     prompt.  The warning above tells the user why.  */
++	  last.set_spurious ();
++	}
++    }
+ 
+   /* As with the notification of thread events, we want to delay
+      notifying the user that we've switched thread context until
+diff --git a/gdb/testsuite/gdb.replay/missing-thread.c b/gdb/testsuite/gdb.replay/missing-thread.c
+new file mode 100644
+--- /dev/null
++++ b/gdb/testsuite/gdb.replay/missing-thread.c
+@@ -0,0 +1,61 @@
++/* This testcase is part of GDB, the GNU debugger.
++
++   Copyright 2025 Free Software Foundation, Inc.
++
++   This program is free software; you can redistribute it and/or modify
++   it under the terms of the GNU General Public License as published by
++   the Free Software Foundation; either version 3 of the License, or
++   (at your option) any later version.
++
++   This program is distributed in the hope that it will be useful,
++   but WITHOUT ANY WARRANTY; without even the implied warranty of
++   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
++   GNU General Public License for more details.
++
++   You should have received a copy of the GNU General Public License
++   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
++
++#include <pthread.h>
++#include <assert.h>
++#include <signal.h>
++#include <stdio.h>
++#include <unistd.h>
++#include <stdbool.h>
++
++pthread_mutex_t g_mutex = PTHREAD_MUTEX_INITIALIZER;
++pthread_cond_t g_condvar = PTHREAD_COND_INITIALIZER;
++
++void *
++worker_function (void *arg)
++{
++  printf ("In worker, about to notify\n");
++  pthread_cond_signal (&g_condvar);
++
++  while (true)
++    sleep(1);
++
++  return NULL;
++}
++
++int
++main()
++{
++  pthread_t my_thread;
++
++  int result = pthread_create (&my_thread, NULL, worker_function, NULL);
++  assert (result == 0);
++
++  pthread_mutex_lock (&g_mutex);
++  pthread_cond_wait (&g_condvar, &g_mutex);
++
++  printf ("In main, have been woken.\n");
++  pthread_mutex_unlock (&g_mutex);
++
++  result = pthread_kill (my_thread, SIGTRAP);
++  assert (result == 0);
++
++  result = pthread_join (my_thread, NULL);
++  assert (result == 0);
++
++  return 0;
++}
+diff --git a/gdb/testsuite/gdb.replay/missing-thread.exp b/gdb/testsuite/gdb.replay/missing-thread.exp
+new file mode 100644
+--- /dev/null
++++ b/gdb/testsuite/gdb.replay/missing-thread.exp
+@@ -0,0 +1,237 @@
++# Copyright 2025 Free Software Foundation, Inc.
++#
++# This program is free software; you can redistribute it and/or modify
++# it under the terms of the GNU General Public License as published by
++# the Free Software Foundation; either version 3 of the License, or
++# (at your option) any later version.
++#
++# This program is distributed in the hope that it will be useful,
++# but WITHOUT ANY WARRANTY; without even the implied warranty of
++# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
++# GNU General Public License for more details.
++#
++# You should have received a copy of the GNU General Public License
++# along with this program.  If not, see <http://www.gnu.org/licenses/>.
++
++# This test confirms how GDB handles a badly behaving remote target.  The
++# remote target reports a stop event (signal delivery), then, as GDB is
++# processing the stop it syncs the thread list with the remote.
++#
++# The badly behaving remote target was dropping the signaled thread from the
++# thread list at this point, that is, the thread appeared to exit before an
++# exit event had been sent to (and seen by) GDB.
++#
++# At one point this was causing an assertion failure.  GDB would try to
++# process the signal stop event, and to do this would try to read some
++# registers.  Reading registers requires a regcache, and GDB will only
++# create a regcache for a non-exited thread.
++
++load_lib gdbserver-support.exp
++load_lib gdbreplay-support.exp
++
++require allow_gdbserver_tests
++require has_gdbreplay
++
++standard_testfile
++
++if { [build_executable "failed to build exec" $testfile $srcfile {debug pthreads}] } {
++    return -1
++}
++
++# Start the inferior and record a remote log for our interaction with it.
++# All we do is start the inferior and wait for thread 2 to receive a signal.
++# Check that GDB correctly shows the signal as received.  LOG_FILENAME is
++# where we should write the remote log.
++proc_with_prefix record_initial_logfile { log_filename } {
++    clean_restart $::testfile
++
++    # Make sure we're disconnected, in case we're testing with an
++    # extended-remote board, therefore already connected.
++    gdb_test "disconnect" ".*"
++
++    gdb_test_no_output "set sysroot" \
++	"setting sysroot before starting gdbserver"
++
++    # Start gdbserver like:
++    #   gdbserver :PORT ....
++    set res [gdbserver_start "" $::binfile]
++    set gdbserver_protocol [lindex $res 0]
++    set gdbserver_gdbport [lindex $res 1]
++
++    gdb_test_no_output "set remotelogfile $log_filename" \
++	"setup remotelogfile"
++
++    # Connect to gdbserver.
++    if {![gdb_target_cmd $gdbserver_protocol $gdbserver_gdbport] == 0} {
++	unsupported "$testfile (couldn't start gdbserver)"
++	return
++    }
++
++    gdb_breakpoint main
++    gdb_continue_to_breakpoint "continuing to main"
++
++    gdb_test "continue" \
++	"Thread $::decimal \[^\r\n\]+ received signal SIGTRAP, .*"
++
++    gdb_test "disconnect" ".*" \
++	"disconnect after seeing signal"
++}
++
++# Copy the remote log from IN_FILENAME to OUT_FILENAME, but modify one
++# particular line.
++#
++# The line to be modified is the last <threads>...</threads> line, this is
++# the reply from the remote that indicates the thread list.  It is expected
++# that the thread list will contain two threads.
++#
++# When DROP_BOTH is true then both threads will be removed from the modified
++# line.  Otherwise, only the second thread is removed.
++proc update_replay_log { in_filename out_filename drop_both } {
++    # Read IN_FILENAME into a list.
++    set fd [open $in_filename]
++    set data [read $fd]
++    close $fd
++    set lines [split $data "\n"]
++
++    # Find the last line in LINES that contains the <threads> list.
++    set idx -1
++    for { set i 0 } { $i < [llength $lines] } { incr i } {
++	if { [regexp "^r.*<threads>.*</threads>" [lindex $lines $i]] } {
++	    set idx $i
++	}
++    }
++
++    # Modify the line by dropping the second thread.  This does assume
++    # the thread order as seen in the <threads>...</threads> list, but
++    # this seems stable for now.
++    set line [lindex $lines $idx]
++    set fixed_log false
++    if {[regexp "^(r .*<threads>\\\\n)(<thread id.*/>\\\\n)(<thread id.*/>\\\\n)(</threads>.*)$" $line \
++	     match part1 part2 part3 part4]} {
++	if { $drop_both } {
++	    set line $part1$part4
++	} else {
++	    set line $part1$part2$part4
++	}
++	set lines [lreplace $lines $idx $idx $line]
++	set fixed_log true
++    }
++
++    # Write all the lines to OUT_FILENAME
++    set fd [open $out_filename "w"]
++    foreach l $lines {
++	puts $fd $l
++    }
++    close $fd
++
++    # Did we manage to update the log file?
++    return $fixed_log
++}
++
++# Replay the test process using REMOTE_LOG as the logfile to replay.  If
++# EXPECT_ERROR is true then after the final 'continue' we expect GDB to give
++# an error as the required thread is missing.  When EXPECT_ERROR is false
++# then we expect the test to complete as normal.  NON_STOP is either 'on' or
++# 'off' and indicates GDB's non-stop mode.
++proc_with_prefix replay_with_log { remote_log expect_error non_stop } {
++    clean_restart $::testfile
++
++    # Make sure we're disconnected, in case we're testing with an
++    # extended-remote board, therefore already connected.
++    gdb_test "disconnect" ".*"
++
++    gdb_test_no_output "set sysroot"
++
++    set res [gdbreplay_start $remote_log]
++    set gdbserver_protocol [lindex $res 0]
++    set gdbserver_gdbport [lindex $res 1]
++
++    # Connect to gdbreplay.
++    if {![gdb_target_cmd $gdbserver_protocol $gdbserver_gdbport] == 0} {
++	fail "couldn't connect to gdbreplay"
++	return
++    }
++
++    gdb_breakpoint main
++    gdb_continue_to_breakpoint "continuing to main"
++
++    if { $expect_error } {
++	set expected_output \
++	    [list \
++		 "\\\[Thread \[^\r\n\]+ exited\\\]" \
++		 "warning: command aborted, Thread \[^\r\n\]+ unexpectedly exited after signal stop event"]
++
++	if { !$non_stop } {
++	    lappend expected_output "\\\[Switching to Thread \[^\r\n\]+\\\]"
++	}
++
++	gdb_test "continue" [multi_line {*}$expected_output]
++    } else {
++	# This is the original behaviour, we see this when running
++	# with the unmodified log.
++	gdb_test "continue" \
++	    "Thread ${::decimal}(?: \[^\r\n\]+)? received signal SIGTRAP, .*"
++    }
++
++    gdb_test "disconnect" ".*" \
++	"disconnect after seeing signal"
++}
++
++# Run the complete test cycle; generate an initial log file, modify the log
++# file, then check that GDB correctly handles replaying the modified log
++# file.
++#
++# NON_STOP is either 'on' or 'off' and indicates GDB's non-stop mode.
++proc run_test { non_stop } {
++    if { $non_stop } {
++	set suffix "-ns"
++    } else {
++	set suffix ""
++    }
++
++    # The replay log is placed in 'replay.log'.
++    set remote_log [standard_output_file replay${suffix}.log]
++    set missing_1_log [standard_output_file replay-missing-1${suffix}.log]
++    set missing_2_log [standard_output_file replay-missing-2${suffix}.log]
++
++    record_initial_logfile $remote_log
++
++    if { ![update_replay_log $remote_log $missing_1_log false] } {
++	fail "couldn't update remote replay log (drop 1 case)"
++    }
++
++    if { ![update_replay_log $remote_log $missing_2_log true] } {
++	fail "couldn't update remote replay log (drop 2 case)"
++    }
++
++    with_test_prefix "with unmodified log" {
++	# Replay with the unmodified log.  This confirms that we can replay this
++	# scenario correctly.
++	replay_with_log $remote_log false $non_stop
++    }
++
++    with_test_prefix "missing 1 thread log" {
++	# Now replay with the modified log, this time the thread that receives
++	# the event should be missing from the thread list, GDB will give an
++	# error when the inferior stops.
++	replay_with_log $missing_1_log true $non_stop
++    }
++
++    with_test_prefix "missing 2 threads log" {
++	# When we drop both threads from the <threads> reply, GDB doesn't
++	# actually remove both threads from the inferior; an inferior must
++	# always have at least one thread.  So in this case, as the primary
++	# thread is first, GDB drops this, then retains the second thread, which
++	# is the one we're stopping in, and so, we don't expect to see the error
++	# in this case.
++	replay_with_log $missing_2_log false $non_stop
++    }
++}
++
++# Run the test twice, with non-stop on and off.
++foreach_with_prefix non_stop { on off } {
++    save_vars { ::GDBFLAGS } {
++	append ::GDBFLAGS " -ex \"set non-stop $non_stop\""
++	run_test $non_stop
++    }
++}

diff --git a/gdb.spec b/gdb.spec
index 447dd37..0775396 100644
--- a/gdb.spec
+++ b/gdb.spec
@@ -45,7 +45,7 @@ Version: 17.1
 
 # The release always contains a leading reserved number, start it at 1.
 # `upstream' is not a part of `name' to stay fully rpm dependencies compatible for the testing.
-Release: 4%{?dist}
+Release: 5%{?dist}
 
 License: GPL-3.0-or-later AND BSD-3-Clause AND FSFAP AND LGPL-2.1-or-later AND GPL-2.0-or-later AND LGPL-2.0-or-later AND LicenseRef-Fedora-Public-Domain AND GFDL-1.3-or-later AND LGPL-2.0-or-later WITH GCC-exception-2.0 AND GPL-3.0-or-later WITH GCC-exception-3.1 AND GPL-2.0-or-later WITH GNU-compiler-exception AND MIT
 # Do not provide URL for snapshots as the file lasts there only for 2 days.
@@ -932,6 +932,11 @@ fi
 # endif scl
 
 %changelog
+* Tue Apr 21 2026 Andrew Burgess <aburgess@redhat.com>
+- Backport upstream commits 8bd08ee92c4 and cd289df068e to address
+  rhbz2366461.  These backports will not be needed once we rebase to
+  GDB 18.
+
 * Wed Feb 25 2026 Kevin Buettner <kevinb@redhat.com>
 - Backport upstream commit c1da013915e from Kevin Buettner to fix
   gcore failures caused by glibc 2.42 guard page changes (RHBZ 2413405).

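The check added to normal_stop by the first backport can be modelled
outside GDB as a small sketch.  Everything named here (ThreadState,
StopKind, Thread, report_signal_stop) is a hypothetical stand-in and
not GDB's real types or API; only the wording of the warning is taken
from the patch.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for GDB internals, for illustration only.
enum class ThreadState { Stopped, Running, Exited };
enum class StopKind { Signal, Spurious };

struct Thread
{
  std::string name;   // e.g. "Thread 1.2"
  ThreadState state;  // state after the thread list was refreshed
};

// Decide how to report a signal stop once the thread list has been
// re-read from the target.  If the signalled thread is still present,
// report the signal as usual.  If the target dropped it from the list,
// reporting the signal would need registers from an exited thread, so
// warn and downgrade the stop to a spurious one instead.
StopKind
report_signal_stop (const Thread &thr, std::string &message)
{
  if (thr.state != ThreadState::Exited)
    {
      message = thr.name + " received signal";
      return StopKind::Signal;
    }

  message = "warning: command aborted, " + thr.name
	    + " unexpectedly exited after signal stop event";
  return StopKind::Spurious;
}
```

Calling report_signal_stop with a thread whose state is
ThreadState::Exited yields StopKind::Spurious and the warning text;
any other state yields StopKind::Signal.  The sketch deliberately
leaves out the thread switch that all-stop mode performs afterwards.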