System freeze a couple of hours after an upgrade.

Today I had my first freeze in ages.
I suspect either an issue with the new kernel or something GPU-related.

Relevant hardware:
CPU: 7600X
GPU: RX9070

Relevant updated packages:
[2025-08-26T17:24:31+0200] [ALPM] upgraded mesa (1:25.1.7-1 -> 1:25.2.1-2)
[2025-08-26T17:24:36+0200] [ALPM] upgraded linux (6.16.1.artix1-1 -> 6.16.2.artix1-1)
[2025-08-26T17:24:38+0200] [ALPM] upgraded linux-headers (6.16.1.artix1-1 -> 6.16.2.artix1-1)
[2025-08-26T17:24:40+0200] [ALPM] upgraded xf86-video-amdgpu (23.0.0-2.2 -> 25.0.0-1)

I'm using OpenRC as my init.

I would love to have a way to keep a crash log / dmesg / kernel panic when something like this happens, but so far I haven't found a way to do it.

I wonder: would a network syslog server that keeps the critical logs be helpful?

Re: System freeze a couple of hours after an upgrade.

Reply #1
(keeping track for myself)

Rolling back the kernel didn't fix the issue.
I've set up netconsole to log to one of my servers. I may capture something at the next crash.
Otherwise, I'll have to look at kexec (which seems to have no associated OpenRC service?)
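
For anyone following along, here is a minimal sketch of how a netconsole target is usually written, using the parameter format from the kernel's netconsole documentation. The sender address 192.168.1.47 is the host that appears in the captured logs in this thread; the interface name `eth0` and the target address 192.168.1.10 are placeholders I made up, not the actual setup. The script only prints the modprobe command and does not load anything.

```shell
#!/bin/sh
# Build a netconsole= parameter string. Format (kernel netconsole docs):
#   [src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr]
# 6665 and 6666 are the documented default source and target ports.
# An empty target MAC means the packets are sent to the broadcast MAC.
netconsole_param() {
    src_ip=$1
    dev=$2
    tgt_ip=$3
    tgt_mac=${4:-}
    printf '%s@%s/%s,%s@%s/%s\n' 6665 "$src_ip" "$dev" 6666 "$tgt_ip" "$tgt_mac"
}

# Hypothetical values: eth0 and 192.168.1.10 are assumptions for illustration.
param=$(netconsole_param 192.168.1.47 eth0 192.168.1.10)

# Print the command instead of running it (loading the module needs root).
echo "modprobe netconsole netconsole=$param"
```

The receiving side only needs something listening on UDP port 6666, such as a syslog daemon with a UDP input or a netcat listener, depending on what is installed on the server.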

Also, keeping the full list of updated packages:

Code:
[2025-08-26T17:24:31+0200] [ALPM] upgraded harfbuzz (11.4.1-1 -> 11.4.3-1)
[2025-08-26T17:24:31+0200] [ALPM] upgraded gdbm (1.25-1 -> 1.26-1)
[2025-08-26T17:24:31+0200] [ALPM] upgraded mesa (1:25.1.7-1 -> 1:25.2.1-2)
[2025-08-26T17:24:32+0200] [ALPM] upgraded qt6-declarative (6.9.1-2 -> 6.9.1-3)
[2025-08-26T17:24:32+0200] [ALPM] upgraded qca-qt6 (2.3.10-2 -> 2.3.10-3)
[2025-08-26T17:24:32+0200] [ALPM] upgraded cryptsetup (2.8.0-1 -> 2.8.1-1)
[2025-08-26T17:24:32+0200] [ALPM] upgraded qt6-webengine (6.9.1-2 -> 6.9.1-3)
[2025-08-26T17:24:32+0200] [ALPM] upgraded libakonadi (25.08.0-1 -> 25.08.0-2)
[2025-08-26T17:24:32+0200] [ALPM] upgraded akonadi (25.08.0-1 -> 25.08.0-2)
[2025-08-26T17:24:32+0200] [ALPM] upgraded libxmlb (0.3.22-1 -> 0.3.23-1)
[2025-08-26T17:24:32+0200] [ALPM] upgraded appstream (1.0.5-2 -> 1.0.6-1)
[2025-08-26T17:24:32+0200] [ALPM] upgraded appstream-qt (1.0.5-2 -> 1.0.6-1)
[2025-08-26T17:24:32+0200] [ALPM] upgraded archlinux-appstream-data (20250730-1 -> 20250825-1)
[2025-08-26T17:24:32+0200] [ALPM] upgraded level-zero-loader (1.22.5-1 -> 1.23.2-1)
[2025-08-26T17:24:32+0200] [ALPM] upgraded pybind11 (3.0.0-1 -> 3.0.1-1)
[2025-08-26T17:24:33+0200] [ALPM] upgraded blender (17:4.5.1-4 -> 17:4.5.2-1)
[2025-08-26T17:24:33+0200] [ALPM] upgraded botan (3.8.1-1 -> 3.9.0-1)
[2025-08-26T17:24:33+0200] [ALPM] upgraded chromium (139.0.7258.66-1 -> 139.0.7258.138-1)
[2025-08-26T17:24:33+0200] [ALPM] upgraded cython (3.1.2-1 -> 3.1.3-1)
[2025-08-26T17:24:33+0200] [ALPM] upgraded git (2.50.1-3 -> 2.51.0-1)
[2025-08-26T17:24:33+0200] [ALPM] upgraded harfbuzz-icu (11.4.1-1 -> 11.4.3-1)
[2025-08-26T17:24:34+0200] [ALPM] upgraded imagemagick (7.1.2.1-1 -> 7.1.2.2-1)
[2025-08-26T17:24:34+0200] [ALPM] upgraded keepassxc (2.7.10-3 -> 2.7.10-4)
[2025-08-26T17:24:34+0200] [ALPM] upgraded krita (5.2.11-1 -> 5.2.11-2)
[2025-08-26T17:24:34+0200] [ALPM] upgraded ktextaddons (1.7.0-1 -> 1.7.1-1)
[2025-08-26T17:24:34+0200] [ALPM] upgraded ldb (2:4.22.3-1 -> 2:4.22.4-1)
[2025-08-26T17:24:34+0200] [ALPM] upgraded lib32-harfbuzz (11.4.1-1 -> 11.4.3-1)
[2025-08-26T17:24:34+0200] [ALPM] upgraded lib32-lcms2 (2.16-1 -> 2.17-1)
[2025-08-26T17:24:34+0200] [ALPM] upgraded lib32-mesa (1:25.1.7-1 -> 1:25.2.1-1)
[2025-08-26T17:24:34+0200] [ALPM] upgraded opencl-mesa (1:25.1.7-1 -> 1:25.2.1-2)
[2025-08-26T17:24:34+0200] [ALPM] upgraded lib32-opencl-mesa (1:25.1.7-1 -> 1:25.2.1-1)
[2025-08-26T17:24:34+0200] [ALPM] upgraded lib32-sqlite (3.50.2-1 -> 3.50.4-1)
[2025-08-26T17:24:34+0200] [ALPM] upgraded vulkan-intel (1:25.1.7-1 -> 1:25.2.1-2)
[2025-08-26T17:24:34+0200] [ALPM] upgraded lib32-vulkan-intel (1:25.1.7-1 -> 1:25.2.1-1)
[2025-08-26T17:24:34+0200] [ALPM] upgraded vulkan-mesa-layers (1:25.1.7-1 -> 1:25.2.1-2)
[2025-08-26T17:24:34+0200] [ALPM] upgraded lib32-vulkan-mesa-layers (1:25.1.7-1 -> 1:25.2.1-1)
[2025-08-26T17:24:34+0200] [ALPM] upgraded vulkan-radeon (1:25.1.7-1 -> 1:25.2.1-2)
[2025-08-26T17:24:34+0200] [ALPM] upgraded lib32-vulkan-radeon (1:25.1.7-1 -> 1:25.2.1-1)
[2025-08-26T17:24:34+0200] [ALPM] upgraded libcamera-ipa (0.5.1-2 -> 0.5.2-1)
[2025-08-26T17:24:34+0200] [ALPM] upgraded libcamera (0.5.1-2 -> 0.5.2-1)
[2025-08-26T17:24:34+0200] [ALPM] upgraded libodfgen (0.1.8-4 -> 0.1.8-5)
[2025-08-26T17:24:34+0200] [ALPM] upgraded liborcus (0.20.1-1 -> 0.20.2-1)
[2025-08-26T17:24:34+0200] [ALPM] upgraded libqalculate (5.6.0-1 -> 5.7.0-1)
[2025-08-26T17:24:34+0200] [ALPM] upgraded raptor (2.0.16-7 -> 2.0.16-8)
[2025-08-26T17:24:35+0200] [ALPM] upgraded libreoffice-fresh (25.2.5-2 -> 25.8.0-1)
[2025-08-26T17:24:35+0200] [ALPM] upgraded libsynctex (2025.2-1 -> 2025.2-2)
[2025-08-26T17:24:35+0200] [ALPM] upgraded libwbclient (2:4.22.3-1 -> 2:4.22.4-1)
[2025-08-26T17:24:36+0200] [ALPM] upgraded linux (6.16.1.artix1-1 -> 6.16.2.artix1-1)
[2025-08-26T17:24:38+0200] [ALPM] upgraded linux-headers (6.16.1.artix1-1 -> 6.16.2.artix1-1)
[2025-08-26T17:24:38+0200] [ALPM] upgraded mingw-w64-binutils (2.44-2 -> 2.45-1)
[2025-08-26T17:24:38+0200] [ALPM] upgraded nano (8.5-1 -> 8.6-1)
[2025-08-26T17:24:38+0200] [ALPM] upgraded nfsidmap (2.8.3-2 -> 2.8.3-3)
[2025-08-26T17:24:38+0200] [ALPM] upgraded nfs-utils (2.8.3-2 -> 2.8.3-3)
[2025-08-26T17:24:38+0200] [ALPM] upgraded qca-qt5 (2.3.10-2 -> 2.3.10-3)
[2025-08-26T17:24:38+0200] [ALPM] upgraded okteta (1:0.26.22-1 -> 1:0.26.23-1)
[2025-08-26T17:24:38+0200] [ALPM] upgraded opencl-headers (2:2024.10.24-1 -> 2:2025.07.22-1)
[2025-08-26T17:24:38+0200] [ALPM] upgraded python-coverage (7.10.4-1 -> 7.10.5-1)
[2025-08-26T17:24:38+0200] [ALPM] upgraded python-cryptography (45.0.5-1 -> 45.0.6-1)
[2025-08-26T17:24:38+0200] [ALPM] upgraded python-filelock (3.18.0-1 -> 3.19.1-1)
[2025-08-26T17:24:38+0200] [ALPM] upgraded python-flask (3.1.1-1 -> 3.1.2-1)
[2025-08-26T17:24:39+0200] [ALPM] upgraded python-hypothesis (6.136.2-1 -> 6.138.2-1)
[2025-08-26T17:24:39+0200] [ALPM] upgraded python-lxml (6.0.0-2 -> 6.0.1-1)
[2025-08-26T17:24:39+0200] [ALPM] upgraded python-multidict (6.6.3-1 -> 6.6.4-1)
[2025-08-26T17:24:39+0200] [ALPM] upgraded python-rpds-py (0.27.0-1 -> 0.27.1-1)
[2025-08-26T17:24:39+0200] [ALPM] upgraded python-setuptools-scm (9.2.0-1 -> 9.2.0-2)
[2025-08-26T17:24:39+0200] [ALPM] upgraded python-xarray (2025.07.1-1 -> 2025.08.0-1)
[2025-08-26T17:24:39+0200] [ALPM] upgraded smbclient (2:4.22.3-1 -> 2:4.22.4-1)
[2025-08-26T17:24:39+0200] [ALPM] upgraded samba (2:4.22.3-1 -> 2:4.22.4-1)
[2025-08-26T17:24:39+0200] [ALPM] upgraded strace (6.15-1 -> 6.16-1)
[2025-08-26T17:24:39+0200] [ALPM] upgraded texlive-bin (2025.2-1 -> 2025.2-2)
[2025-08-26T17:24:39+0200] [ALPM] upgraded texlive-basic (2025.2-1.1 -> 2025.2-2)
[2025-08-26T17:24:39+0200] [ALPM] upgraded texlive-langcjk (2025.2-1.1 -> 2025.2-2)
[2025-08-26T17:24:40+0200] [ALPM] upgraded texlive-langjapanese (2025.2-1.1 -> 2025.2-2)
[2025-08-26T17:24:40+0200] [ALPM] upgraded thin-provisioning-tools (1.2.0-1 -> 1.2.1-1)
[2025-08-26T17:24:40+0200] [ALPM] upgraded vulkan-swrast (1:25.1.7-1 -> 1:25.2.1-2)
[2025-08-26T17:24:40+0200] [ALPM] upgraded whois (5.6.1-1 -> 5.6.4-1)
[2025-08-26T17:24:40+0200] [ALPM] upgraded xf86-video-amdgpu (23.0.0-2.2 -> 25.0.0-1)
[2025-08-26T17:24:40+0200] [ALPM] upgraded yt-dlp (2025.07.21-1 -> 2025.08.22-1)
[2025-08-26T17:24:40+0200] [ALPM] upgraded zed (0.200.4-1 -> 0.200.5-1)
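
With a list this long, it can help to narrow it down to the packages most likely to matter for a freeze. The sketch below filters pacman log lines on standard input for kernel and GPU stack upgrades. The function name and the package list in the pattern are my own choice (an assumption about what counts as suspicious), so adjust them as needed.

```shell
#!/bin/sh
# Keep only "[ALPM] upgraded" entries whose package name is kernel- or
# GPU-related. The set of names below is an illustrative guess, not a rule.
suspect_upgrades() {
    grep -E '\[ALPM\] upgraded (lib32-)?(linux|linux-headers|linux-firmware|mesa|xf86-video-amdgpu|vulkan-radeon|opencl-mesa) \('
}

# Typical use, reading pacman's default log location:
#   suspect_upgrades < /var/log/pacman.log
```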


Re: System freeze a couple of hours after an upgrade.

Reply #3
Not yet.

This morning I woke up to a few kernel "Oops" messages that had left the system unusable.

Given that the RX 9070 is relatively recent, I don't know if the LTS kernel properly supports it.

I'm considering filing a bug report with the kernel devs, but if they've fixed it after 6.16.2 it'll be a waste of everyone's time.
I'm also considering building 6.16.4 (which fixes many bugs) and going from there.

I've had no issues on that machine for quite a long time. The update made on the 26th of August is what broke it.
Before trying an LTS kernel, rolling back to 6.15 may be preferable.




Configuring netconsole to target one of my home servers has already paid off.
The first Oops:

Code:
Aug 28 22:34:46 192.168.1.47 [11889.236825] BUG: unable to handle page fault for address: fffff00c38063248
Aug 28 22:34:46 192.168.1.47 [11889.236835] #PF: supervisor read access in kernel mode
Aug 28 22:34:46 192.168.1.47 [11889.236838] #PF: error_code(0x0000) - not-present page
Aug 28 22:34:46 192.168.1.47 [11889.236840] PGD 83deca067 P4D 83deca067 PUD 0
Aug 28 22:34:46 192.168.1.47 [11889.236845] Oops: Oops: 0000 [#1] SMP NOPTI
Aug 28 22:34:46 192.168.1.47 [11889.236849] CPU: 0 UID: 1000 PID: 3941 Comm: chrome_crashpad Not tainted 6.16.2-artix1-1 #1 PREEMPT(full)  37c67756271dd857632c3bdd9372ff663f9d2da3
Aug 28 22:34:46 192.168.1.47 [11889.236853] Hardware name: Micro-Star International Co., Ltd. MS-7D73/MPG B650I EDGE WIFI (MS-7D73), BIOS 1.H0 03/12/2025
Aug 28 22:34:46 192.168.1.47 [11889.236856] RIP: 0010:follow_page_pte+0xd1/0x470
Aug 28 22:34:46 192.168.1.47 [11889.236862] Code: 6f 01 49 89 c7 89 d8 48 81 e1 00 f0 ff ff 48 f7 d1 4c 21 f1 83 e0 01 89 44 24 0c 0f 85 50 01 00 00 4d 85 ff 0f 84 94 00 00 00 <4d> 8b 67 08 41 f6 c4 01 0f 85 5d 03 00 00 0f 1f 44 00 00 4d 89 fc
Aug 28 22:34:46 192.168.1.47 [11889.236865] RSP: 0018:ffffcf484b493b68 EFLAGS: 00010282
Aug 28 22:34:46 192.168.1.47 [11889.236868] RAX: 0000000000000000 RBX: 000000000005000a RCX: 800000000000082b
Aug 28 22:34:46 192.168.1.47 [11889.236870] RDX: 800000000000082b RSI: 0000044002c113a0 RDI: 8000004e018c982b
Aug 28 22:34:46 192.168.1.47 [11889.236873] RBP: ffff8d1847864780 R08: ffffcf484b493c38 R09: ffff8d189d4750b0
Aug 28 22:34:46 192.168.1.47 [11889.236875] R10: 0000000000000001 R11: ffff8d1898fb270c R12: 0000044002c113a0
Aug 28 22:34:46 192.168.1.47 [11889.236877] R13: ffff8d18e9458088 R14: 8000004e018c982b R15: fffff00c38063240
Aug 28 22:34:46 192.168.1.47 [11889.236880] FS:  00007fd04d9dc8c0(0000) GS:ffff8d1ed251b000(0000) knlGS:0000000000000000
Aug 28 22:34:46 192.168.1.47 [11889.236882] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 28 22:34:46 192.168.1.47 [11889.236885] CR2: fffff00c38063248 CR3: 000000019a2f9000 CR4: 0000000000f50ef0
Aug 28 22:34:46 192.168.1.47 [11889.236887] PKRU: 55555554
Aug 28 22:34:46 192.168.1.47 [11889.236889] Call Trace:
Aug 28 22:34:46 192.168.1.47 [11889.236892]  <TASK>
Aug 28 22:34:46 192.168.1.47 [11889.236895]  __get_user_pages+0xa63/0x1370
Aug 28 22:34:46 192.168.1.47 [11889.236900]  ? __free_frozen_pages+0x567/0x720
Aug 28 22:34:46 192.168.1.47 [11889.236906]  get_user_pages_remote+0x101/0x4e0
Aug 28 22:34:46 192.168.1.47 [11889.236910]  __access_remote_vm+0xe3/0x380
Aug 28 22:34:46 192.168.1.47 [11889.236915]  mem_rw+0x1b9/0x2b0
Aug 28 22:34:46 192.168.1.47 [11889.236921]  vfs_read+0xbc/0x390
Aug 28 22:34:46 192.168.1.47 [11889.236925]  ? srso_alias_return_thunk+0x5/0xfbef5
Aug 28 22:34:46 192.168.1.47 [11889.236930]  __x64_sys_pread64+0x9c/0xd0
Aug 28 22:34:46 192.168.1.47 [11889.236933]  do_syscall_64+0x81/0x970
Aug 28 22:34:46 192.168.1.47 [11889.236938]  ? srso_alias_return_thunk+0x5/0xfbef5
Aug 28 22:34:46 192.168.1.47 [11889.236941]  ? __x64_sys_pread64+0xb0/0xd0
Aug 28 22:34:46 192.168.1.47 [11889.236944]  ? srso_alias_return_thunk+0x5/0xfbef5
Aug 28 22:34:46 192.168.1.47 [11889.236947]  ? do_syscall_64+0x81/0x970
Aug 28 22:34:46 192.168.1.47 [11889.236950]  ? srso_alias_return_thunk+0x5/0xfbef5
Aug 28 22:34:46 192.168.1.47 [11889.236952]  ? do_syscall_64+0x81/0x970
Aug 28 22:34:46 192.168.1.47 [11889.236955]  ? srso_alias_return_thunk+0x5/0xfbef5
Aug 28 22:34:46 192.168.1.47 [11889.236958]  ? exc_page_fault+0x7e/0x1a0
Aug 28 22:34:46 192.168.1.47 [11889.236962]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
Aug 28 22:34:46 192.168.1.47 [11889.236964] RIP: 0033:0x7fd04d69ca12
Aug 28 22:34:46 192.168.1.47 [11889.236967] Code: 08 0f 85 21 44 ff ff 49 89 fb 48 89 f0 48 89 d7 48 89 ce 4c 89 c2 4d 89 ca 4c 8b 44 24 08 4c 8b 4c 24 10 4c 89 5c 24 08 0f 05 <c3> 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa 48 83 ec 08
Aug 28 22:34:46 192.168.1.47 [11889.236970] RSP: 002b:00007ffecee528b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000011
Aug 28 22:34:46 192.168.1.47 [11889.236973] RAX: ffffffffffffffda RBX: 000055af14b1f268 RCX: 00007fd04d69ca12
Aug 28 22:34:46 192.168.1.47 [11889.236975] RDX: 0000000000000200 RSI: 000027d000148000 RDI: 0000000000000008
Aug 28 22:34:46 192.168.1.47 [11889.236977] RBP: 00007ffecee52a90 R08: 0000000000000000 R09: 0000000000000000
Aug 28 22:34:46 192.168.1.47 [11889.236979] R10: 0000044002c113a0 R11: 0000000000000246 R12: 0000044002c113a0
Aug 28 22:34:46 192.168.1.47 [11889.236982] R13: 000027d0000146c0 R14: 000027d000148000 R15: 0000000000000200
Aug 28 22:34:46 192.168.1.47 [11889.236986]  </TASK>
Aug 28 22:34:46 192.168.1.47 [11889.236988] Modules linked in: snd_seq_dummy snd_hrtimer snd_seq rfcomm qrtr rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace nfs_localio netfs sunrpc uhid cmac algif_hash algif_skcipher af_alg bnep i2c_dev crypto_user amd_atl intel_rapl_msr intel_rapl_common mt7921e mt7921_common mt792x_lib snd_hda_codec_hdmi mt76_connac_lib kvm_amd mt76 snd_hda_intel vfat fat kvm snd_intel_dspcfg irqbypass mac80211 snd_intel_sdw_acpi btusb polyval_clmulni snd_usb_audio btrtl ghash_clmulni_intel snd_hda_codec sha512_ssse3 btintel snd_usbmidi_lib sha1_ssse3 btbcm snd_hda_core snd_ump aesni_intel btmtk snd_rawmidi libarc4 spd5118 rapl mousedev snd_seq_device snd_hwdep bluetooth cfg80211 snd_pcm wmi_bmof joydev pcspkr sp5100_tco mc snd_timer snd rfkill i2c_piix4 soundcore ccp k10temp i2c_smbus gpio_amdpt mac_hid gpio_generic hid_cherry hid_logitech_hidpp amdgpu amdxcp i2c_algo_bit drm_ttm_helper ttm drm_exec gpu_sched drm_suballoc_helper nvme drm_panel_backlight_quirks drm_buddy nvme_cor
Aug 28 22:34:46 192.168.1.47 [11889.237058]  nvme_keyring video cec nvme_auth wmi netconsole r8169 realtek mdio_devres libphy mdio_bus
Aug 28 22:34:46 192.168.1.47 [11889.237076] CR2: fffff00c38063248
Aug 28 22:34:46 192.168.1.47 [11889.237079] ---[ end trace 0000000000000000 ]---
Aug 28 22:34:46 192.168.1.47 [11889.237082] RIP: 0010:follow_page_pte+0xd1/0x470
Aug 28 22:34:46 192.168.1.47 [11889.237086] Code: 6f 01 49 89 c7 89 d8 48 81 e1 00 f0 ff ff 48 f7 d1 4c 21 f1 83 e0 01 89 44 24 0c 0f 85 50 01 00 00 4d 85 ff 0f 84 94 00 00 00 <4d> 8b 67 08 41 f6 c4 01 0f 85 5d 03 00 00 0f 1f 44 00 00 4d 89 fc
Aug 28 22:34:46 192.168.1.47 [11889.237088] RSP: 0018:ffffcf484b493b68 EFLAGS: 00010282
Aug 28 22:34:46 192.168.1.47 [11889.237091] RAX: 0000000000000000 RBX: 000000000005000a RCX: 800000000000082b
Aug 28 22:34:46 192.168.1.47 [11889.237094] RDX: 800000000000082b RSI: 0000044002c113a0 RDI: 8000004e018c982b
Aug 28 22:34:46 192.168.1.47 [11889.237096] RBP: ffff8d1847864780 R08: ffffcf484b493c38 R09: ffff8d189d4750b0
Aug 28 22:34:46 192.168.1.47 [11889.237098] R10: 0000000000000001 R11: ffff8d1898fb270c R12: 0000044002c113a0
Aug 28 22:34:46 192.168.1.47 [11889.237100] R13: ffff8d18e9458088 R14: 8000004e018c982b R15: fffff00c38063240
Aug 28 22:34:46 192.168.1.47 [11889.237102] FS:  00007fd04d9dc8c0(0000) GS:ffff8d1ed251b000(0000) knlGS:0000000000000000
Aug 28 22:34:46 192.168.1.47 [11889.237104] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 28 22:34:46 192.168.1.47 [11889.237107] CR2: fffff00c38063248 CR3: 000000019a2f9000 CR4: 0000000000f50ef0
Aug 28 22:34:46 192.168.1.47 [11889.237109] PKRU: 55555554
Aug 28 22:34:46 192.168.1.47 [11889.237111] note: chrome_crashpad[3941] exited with irqs disabled
Aug 28 22:34:46 192.168.1.47 [11889.237113] note: chrome_crashpad[3941] exited with preempt_count 1
Aug 28 22:34:51 192.168.1.47 [11894.237218] brave[16621]: segfault at 44002c11420 ip 00007f86eca4d993 sp 00007f86e77f4648 error 5
Aug 28 22:34:51 192.168.1.47 [11894.237218] brave[16620]: segfault at 44002c11420 ip 00007f86eca4d993 sp 00007f86e7ff5f48 error 5 in libwidevinecdm.so[c4c993,7f86ec92f000+696000]
Aug 28 22:34:51 192.168.1.47 [11894.237232]  in libwidevinecdm.so[c4c993,7f86ec92f000+696000]
Aug 28 22:34:51 192.168.1.47 [11894.237233]  likely on CPU 9 (core 3, socket 0)
Aug 28 22:34:51 192.168.1.47 [11894.237236]  likely on CPU 10 (core 4, socket 0)
Aug 28 22:34:51 192.168.1.47 [11894.237237] Code: 0c 16 48 8d 34 56 66 0f 7f 07 66 0f 7f 4f 20 f3 0f 6f 06 f3 0f 6f 0c 16 48 8d 34 56 66 0f 7f 47 40 66 0f 7f 4f 60 48 83 ef 80 <f3> 0f 6f 06 f3 0f 6f 0c 16 48 8d 34 56 66 0f 7f 07 66 0f 7f 4f 20
Aug 28 22:34:51 192.168.1.47 [11894.237241] Code: 0c 16 48 8d 34 56 66 0f 7f 07 66 0f 7f 4f 20 f3 0f 6f 06 f3 0f 6f 0c 16 48 8d 34 56 66 0f 7f 47 40 66 0f 7f 4f 60 48 83 ef 80 <f3> 0f 6f 06 f3 0f 6f 0c 16 48 8d 34 56 66 0f 7f 07 66 0f 7f 4f 20

It was pretty soon followed by kilobytes of stack traces.
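
When a capture runs to kilobytes, a quick way to see what actually went wrong is to pull out only the headline lines. This is a rough sketch that reads a netconsole capture on standard input and prints each distinct BUG, Oops, kernel-mode RIP and Comm line once, with the syslog prefix and kernel timestamp stripped. The function name and the file name in the usage comment are hypothetical.

```shell
#!/bin/sh
# Summarise a netconsole/syslog capture read from stdin.
# "RIP: 0010:" selects the kernel-mode instruction pointer (0x10 is the
# kernel code segment on x86-64), so the user-space "RIP: 0033:" line is skipped.
oops_summary() {
    grep -E 'BUG: |Oops: |RIP: 0010:|Comm: ' |
        # Drop everything up to and including the "[seconds.micros] " timestamp.
        sed -E 's/^.*\[ *[0-9]+\.[0-9]+\] //' |
        # Print each distinct line once, keeping first-seen order.
        awk '!seen[$0]++'
}

# Typical use (netconsole.log is a placeholder for wherever the server stores it):
#   oops_summary < netconsole.log
```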

Re: System freeze a couple of hours after an upgrade.

Reply #4
Rolling back the kernel didn't fix the issue.

Have you tried with the LTS kernel?
After many crashes and a failed attempt at making kdumpst work (it says it works, but it doesn't), I tried the LTS kernel, but the packages are broken (I need the r8168-lts package).
So instead I've reverted to 6.15.9, which wasn't crashing.
I still believe it's GPU-related.

I tried switching to Wayland, but the crash happened much sooner.


Re: System freeze a couple of hours after an upgrade.

Reply #5
I've rolled back mesa & co:
Code:
pacman -U *1:25.1.7-1*zst
That's the last suspicious set of packages.
And on my next reboot, I'm already set to switch back to X11.
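
Before running a wildcard rollback like the one above, it may be worth listing exactly which cached files the pattern picks up. This is only a sketch: the function name is made up, and it assumes the packages are still in pacman's default cache directory. Matching on `*.zst` leaves out the detached `.sig` files that sit next to the packages.

```shell
#!/bin/sh
# List cached package files whose name contains the given version string.
# Arguments: version [cache-dir]; the directory defaults to pacman's usual cache.
cached_pkgs() {
    ver=$1
    dir=${2:-/var/cache/pacman/pkg}
    for f in "$dir"/*"$ver"*.zst; do
        # Skip the unexpanded pattern when nothing matches.
        if [ -e "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
}

# Typical use, to review the list before passing the files to pacman -U:
#   cached_pkgs '1:25.1.7-1'
```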

After that, I'm stumped.

(Or I've had a hardware failure at the very moment of the last update, but what are the chances...)

Or it's a firmware update that took time to fail.

Re: System freeze a couple of hours after an upgrade.

Reply #6
Have you downgraded everything that was upgraded? Unexpected things can cause problems sometimes, even though certain packages might look like more likely culprits. I had some recent problems after an update, and while it is quite possibly a separate issue, checking package integrity revealed a corrupted file at one point. I can only guess why: perhaps some broken thing on my system was prompted to write a byte to a random location under specific conditions? Since initially similar oddness happened on 2 machines after recent updates, a "HW problem" seems less likely.
https://forum.artixlinux.org/index.php/topic,8601.msg51816/topicseen.html#msg51816

Code:
paccheck --md5sum --quiet 

Re: System freeze a couple of hours after an upgrade.

Reply #7
Good idea, I'll try that next: thank you.

So far I've downgraded the kernel and everything GPU-related (not the firmware, though). These are the typical culprits for this kind of issue.

My system has just crashed. Netconsole didn't give me anything, and the kdump thing isn't working at all. (I'm contemplating reinstalling a full systemd system like Arch or similar just so I can try to use these diagnostic features. I may end up doing that if I don't have a fix by next weekend.)

One "funny" thing is that it's my living-room computer. My office one is also an up-to-date Artix and it is working fine. (But it has a 4070Ti Super instead of the RX 9070.)

edit: I've run the command: no issues found.

If only I could have one hint about something wrong somewhere.
It's so random.
Sometimes it freezes.
Sometimes it freezes and the sound loops (if I'm watching a video or playing a game).
Sometimes the fans spin up "forever" (I presume the CPU is racing).

And the only trace I've ever had (but that didn't kill the system outright) was some page fault related to the "brave" browser (the system died a few minutes after that though)


Re: System freeze a couple of hours after an upgrade.

Reply #8
I've looked back at the only stack-trace I've found so far (with the latest kernel):
Code:
Aug 29 06:02:01 192.168.1.47 [38724.528429] watchdog: BUG: soft lockup - CPU#4 stuck for 109s! [kcompactd0:99]
Aug 29 06:02:01 192.168.1.47 [38724.528431] CPU#4 Utilization every 4s during lockup:
Aug 29 06:02:01 192.168.1.47 [38724.528432]     #1: 100% system,          0% softirq,     0% hardirq,     0% idle
Aug 29 06:02:01 192.168.1.47 [38724.528433]     #2: 100% system,          0% softirq,     0% hardirq,     0% idle
Aug 29 06:02:01 192.168.1.47 [38724.528434]     #3: 100% system,          0% softirq,     1% hardirq,     0% idle
Aug 29 06:02:01 192.168.1.47 [38724.528434]     #4: 100% system,          0% softirq,     0% hardirq,     0% idle
Aug 29 06:02:01 192.168.1.47 [38724.528435]     #5: 100% system,          0% softirq,     0% hardirq,     0% idle
Aug 29 06:02:01 192.168.1.47 [38724.528436] Modules linked in: snd_seq_dummy snd_hrtimer snd_seq rfcomm qrtr rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace nfs_localio netfs sunrpc uhid cmac algif_hash algif_skcipher af_alg bnep i2c_dev crypto_user amd_atl intel_rapl_msr intel_rapl_common mt7921e mt7921_common mt792x_lib snd_hda_codec_hdmi mt76_connac_lib kvm_amd mt76 snd_hda_intel vfat fat kvm snd_intel_dspcfg irqbypass mac80211 snd_intel_sdw_acpi btusb polyval_clmulni snd_usb_audio btrtl ghash_clmulni_intel snd_hda_codec sha512_ssse3 btintel snd_usbmidi_lib sha1_ssse3 btbcm snd_hda_core snd_ump aesni_intel btmtk snd_rawmidi libarc4 spd5118 rapl mousedev snd_seq_device snd_hwdep bluetooth cfg80211 snd_pcm wmi_bmof joydev pcspkr sp5100_tco mc snd_timer snd rfkill i2c_piix4 soundcore ccp k10temp i2c_smbus gpio_amdpt mac_hid gpio_generic hid_cherry hid_logitech_hidpp amdgpu amdxcp i2c_algo_bit drm_ttm_helper ttm drm_exec gpu_sched drm_suballoc_helper nvme drm_panel_backlight_quirks drm_buddy nvme_cor
Aug 29 06:02:01 192.168.1.47 e drm_display_helper hid_logitech_dj
Aug 29 06:02:01 192.168.1.47 [38724.528474]  nvme_keyring video cec nvme_auth wmi netconsole r8169 realtek mdio_devres libphy mdio_bus
Aug 29 06:02:01 192.168.1.47 [38724.528481] CPU: 4 UID: 0 PID: 99 Comm: kcompactd0 Tainted: G      D      L      6.16.2-artix1-1 #1 PREEMPT(full)  37c67756271dd857632c3bdd9372ff663f9d2da3
Aug 29 06:02:01 192.168.1.47 [38724.528484] Tainted: [D]=DIE, [L]=SOFTLOCKUP
Aug 29 06:02:01 192.168.1.47 [38724.528484] Hardware name: Micro-Star International Co., Ltd. MS-7D73/MPG B650I EDGE WIFI (MS-7D73), BIOS 1.H0 03/12/2025
Aug 29 06:02:01 192.168.1.47 [38724.528485] RIP: 0010:native_queued_spin_lock_slowpath+0x67/0x2f0
Aug 29 06:02:01 192.168.1.47 [38724.528489] Code: 0f ba 29 08 0f 92 c2 8b 01 0f b6 d2 c1 e2 08 30 e4 09 d0 3d ff 00 00 00 77 55 85 c0 74 10 0f b6 01 84 c0 74 09 f3 90 0f b6 01 <84> c0 75 f7 b8 01 00 00 00 66 89 01 65 48 ff 05 8d c0 fa 01 e9 1b
Aug 29 06:02:01 192.168.1.47 [38724.528490] RSP: 0018:ffffcf484050b850 EFLAGS: 00000202
Aug 29 06:02:01 192.168.1.47 [38724.528491] RAX: 0000000000000001 RBX: ffffcf484050b8f8 RCX: fffff00b0aa51628
Aug 29 06:02:01 192.168.1.47 [38724.528492] RDX: 0000000000000000 RSI: 0000000000000001 RDI: fffff00b0aa51628
Aug 29 06:02:01 192.168.1.47 [38724.528493] RBP: 0000044002c79000 R08: ffffcf484050b868 R09: 00000000ffffffff
Aug 29 06:02:01 192.168.1.47 [38724.528493] R10: 0000008000000000 R11: 0000000000000000 R12: ffff8d174f45ae00
Aug 29 06:02:01 192.168.1.47 [38724.528494] R13: ffff8d1847864780 R14: ffffff8000000000 R15: ffff8d174f45ae00
Aug 29 06:02:01 192.168.1.47 [38724.528495] FS:  0000000000000000(0000) GS:ffff8d1ed261b000(0000) knlGS:0000000000000000
Aug 29 06:02:01 192.168.1.47 [38724.528495] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 29 06:02:01 192.168.1.47 [38724.528496] CR2: 00007f612010af78 CR3: 0000000106024000 CR4: 0000000000f50ef0
Aug 29 06:02:01 192.168.1.47 [38724.528497] PKRU: 55555554
Aug 29 06:02:01 192.168.1.47 [38724.528497] Call Trace:
Aug 29 06:02:01 192.168.1.47 [38724.528498]  <TASK>
Aug 29 06:02:01 192.168.1.47 [38724.528499]  _raw_spin_lock+0x29/0x30
Aug 29 06:02:01 192.168.1.47 [38724.528501]  page_vma_mapped_walk+0x65b/0x9d0
Aug 29 06:02:01 192.168.1.47 [38724.528504]  ? __lruvec_stat_mod_folio+0x85/0xd0
Aug 29 06:02:01 192.168.1.47 [38724.528507]  try_to_migrate_one+0x118/0xd20
Aug 29 06:02:01 192.168.1.47 [38724.528511]  rmap_walk_anon+0xd5/0x1f0
Aug 29 06:02:01 192.168.1.47 [38724.528514]  try_to_migrate+0x8d/0x160
Aug 29 06:02:01 192.168.1.47 [38724.528516]  ? __pfx_try_to_migrate_one+0x10/0x10
Aug 29 06:02:01 192.168.1.47 [38724.528517]  ? __pfx_folio_not_mapped+0x10/0x10
Aug 29 06:02:01 192.168.1.47 [38724.528518]  ? __pfx_folio_lock_anon_vma_read+0x10/0x10
Aug 29 06:02:01 192.168.1.47 [38724.528520]  ? __pfx_invalid_migration_vma+0x10/0x10
Aug 29 06:02:01 192.168.1.47 [38724.528521]  migrate_pages_batch+0x2a2/0xd20
Aug 29 06:02:01 192.168.1.47 [38724.528524]  ? __pfx_compaction_alloc+0x10/0x10
Aug 29 06:02:01 192.168.1.47 [38724.528526]  ? __pfx_compaction_free+0x10/0x10
Aug 29 06:02:01 192.168.1.47 [38724.528529]  ? __pfx_remove_migration_pte+0x10/0x10
Aug 29 06:02:01 192.168.1.47 [38724.528531]  ? __pfx_compaction_alloc+0x10/0x10
Aug 29 06:02:01 192.168.1.47 [38724.528532]  migrate_pages+0xb0e/0xe20
Aug 29 06:02:01 192.168.1.47 [38724.528534]  ? __pfx_compaction_free+0x10/0x10
Aug 29 06:02:01 192.168.1.47 [38724.528536]  ? __pfx_compaction_alloc+0x10/0x10
Aug 29 06:02:01 192.168.1.47 [38724.528539]  compact_zone+0x5c1/0x1060
Aug 29 06:02:01 192.168.1.47 [38724.528542]  compact_node+0xa9/0x120
Aug 29 06:02:01 192.168.1.47 [38724.528547]  kcompactd+0x343/0x440
Aug 29 06:02:01 192.168.1.47 [38724.528548]  ? __pfx_autoremove_wake_function+0x10/0x10
Aug 29 06:02:01 192.168.1.47 [38724.528552]  ? __pfx_kcompactd+0x10/0x10
Aug 29 06:02:01 192.168.1.47 [38724.528553]  kthread+0xf9/0x240
Aug 29 06:02:01 192.168.1.47 [38724.528555]  ? __pfx_kthread+0x10/0x10
Aug 29 06:02:01 192.168.1.47 [38724.528556]  ret_from_fork+0x197/0x1d0
Aug 29 06:02:01 192.168.1.47 [38724.528559]  ? __pfx_kthread+0x10/0x10
Aug 29 06:02:01 192.168.1.47 [38724.528560]  ret_from_fork_asm+0x1a/0x30
Aug 29 06:02:01 192.168.1.47 [38724.528564]  </TASK>
Based on this, the two typical culprits, if it were hardware-related (physical or driver), would be the NVMe or the GPU.
In my experience, if it's one of those, it's almost always the GPU.

edit: This system has been running almost non-stop since February 2023.

 

Re: System freeze a couple of hours after an upgrade.

Reply #9
Have you rebooted after the upgrade? You will keep running the kernel you booted into until you reboot. You could also check for any pacnew files: sometimes they contain important config changes. OK, nanorc is unlikely to crash the system, but some might be relevant. Another fairly easy thing to try would be to create a new user and log in as that user, in case some config or cache in your home directory is causing the problem. There have been previous issues with the mesa or browser caches after updates, for example. Also, if this happens with a particular browser (or other app), then perhaps try a different one (only to test with), ideally an unrelated one: if you use a Chromium-based browser, try a Mozilla-based one, and vice versa. It doesn't seem my issue is related now.
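
As a rough illustration of the pacnew check suggested above, this sketch lists any `.pacnew` files under a directory, defaulting to `/etc`, where most of them end up. The function name is made up for the example. The `pacdiff` tool from pacman-contrib covers the same ground and can also merge the files.

```shell
#!/bin/sh
# List .pacnew files below a directory (default: /etc).
# Errors from unreadable subdirectories are hidden so the list stays clean.
find_pacnew() {
    find "${1:-/etc}" -type f -name '*.pacnew' 2>/dev/null
}

# Typical use:
#   find_pacnew
#   find_pacnew /usr/share
```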

Re: System freeze a couple of hours after an upgrade.

Reply #10
I may have rebooted 70 times since ^_^.

My current solution is to write my own, working, "kdump".
It's almost done. It should target NFS, SSH and, as the last fallback, the filesystem.
I only need to fix some weird kink. Hopefully I'll have more time for this next weekend.