GPU HANG: ecode 9:1:86dffffd, in Xorg [2210]
Kernel: 5.13.0-rc7-1-mainline-00073-g55fcd4493da5 x86_64
Driver: 20201103
Time: 1624705893 s 286164 us
Boottime: 55886 s 975106 us
Uptime: 759 s 263037 us
Capture: 4307623744 jiffies; 35447 ms ago
Active process (on ring rcs0): Xorg [2210]
Reset count: 0
Suspend count: 1
Platform: SKYLAKE
Subplatform: 0x0
PCI ID: 0x1912
PCI Revision: 0x06
PCI Subsystem: 1043:8694
IOMMU enabled?: 0
DMC loaded: yes
DMC fw version: 1.27
RPM wakelock: yes
PM suspended: no
GT awake: yes
EIR: 0x00000000
IER: 0x08080000
GTIER[0]: 0x09090909
GTIER[1]: 0x09090909
GTIER[2]: 0x00000000
GTIER[3]: 0x00000909
PGTBL_ER: 0x00000000
FORCEWAKE: 0xffff0001
DERRMR: 0x2077efef
fence[0] = 134603b00b40001
fence[1] = 00000000
fence[2] = 00000000
fence[3] = 00000000
fence[4] = 00000000
fence[5] = 00000000
fence[6] = 00000000
fence[7] = 00000000
fence[8] = 00000000
fence[9] = 00000000
fence[10] = 00000000
fence[11] = 00000000
fence[12] = 00000000
fence[13] = 00000000
fence[14] = 00000000
fence[15] = 00000000
fence[16] = 00000000
fence[17] = 00000000
fence[18] = 00000000
fence[19] = 00000000
fence[20] = 00000000
fence[21] = 00000000
fence[22] = 00000000
fence[23] = 00000000
fence[24] = 00000000
fence[25] = 00000000
fence[26] = 00000000
fence[27] = 00000000
fence[28] = 00000000
fence[29] = 00000000
fence[30] = 00000000
fence[31] = 00000000
ERROR: 0x00000000
DONE_REG: 0x07ffffff
FAULT_TLB_DATA: 0x0000001c 0x9704e00c
GTT_CACHE_EN: 0xf0007fff
rcs0 command stream:
CCID: 0x00000000
START: 0x00001000
HEAD: 0x00002da0 [0x00002d48]
TAIL: 0x00002eb8 [0x00002da8, 0x00002df8]
CTL: 0x00003001
MODE: 0x00000000
HWS: 0xffffe000
ACTHD: 0x0000fffe ec00bb90
IPEIR: 0x00000000
IPEHR: 0x79000002
ESR: 0x00000000
INSTDONE: 0xffdfffff
SC_INSTDONE: 0xfffffffe
SAMPLER_INSTDONE[0][0]: 0xffffffff
SAMPLER_INSTDONE[0][1]: 0xffffffff
SAMPLER_INSTDONE[0][2]: 0xffffffff
ROW_INSTDONE[0][0]: 0xffffffff
ROW_INSTDONE[0][1]: 0xffffffff
ROW_INSTDONE[0][2]: 0xffffffff
batch: [0x0000fffe_ec00a000, 0x0000fffe_ec014000]
BBADDR: 0x0000fffe_ec00bb91
BB_STATE: 0x00000020
INSTPS: 0x00009010
INSTPM: 0x00000000
FADDR: 0x0000fffe ec00bd80
RC PSMI: 0x00000010
FAULT_REG: 0x00000000
GFX_MODE: 0x00008000
PDP0: 0x0000000103508000
PDP1: 0x0000000000000000
PDP2: 0x0000000000000000
PDP3: 0x0000000000000000
hung: 1
engine reset count: 0
ELSP[0]: pid 2210, seqno b:000157fe, prio 0, head 00002e00, tail 00002eb8
ELSP[1]: pid 0, seqno 3:0000507d, prio 0, head 00000cc0, tail 00000d48
Active context: Xorg[2210] prio 0, guilty 1 active 0, runtime total 20516762616ns, avg 1227324ns
...
rcs0 --- user = 0x0000fffe ff001000
...
rcs0 --- user = 0x00000000 f8000000
...
rcs0 --- user = 0x00000000 e0000000
...
rcs0 --- ring = 0x00000000 00001000
...
rcs0 --- HW context = 0x00000000 fffc7000
...
available engines: 0
slice total: 0, mask=0000
subslice total: 0
EU total: 0
EU per subslice: 0
has slice power gating: no
has subslice power gating: no
has EU power gating: no
Unavailable
gen: 9
gt: 2
iommu: disabled
memory-regions: 5
page-sizes: 11000
platform: SKYLAKE
ppgtt-size: 48
ppgtt-type: 2
dma_mask_size: 39
is_mobile: no
is_lp: no
require_force_probe: no
is_dgfx: no
has_64bit_reloc: yes
gpu_reset_clobbers_display: no
has_reset_engine: yes
has_global_mocs: no
has_gt_uc: yes
has_l3_dpf: no
has_llc: yes
has_logical_ring_contexts: yes
has_logical_ring_elsq: no
has_master_unit_irq: no
has_pooled_eu: no
has_rc6: yes
has_rc6p: no
has_rps: yes
has_runtime_pm: yes
has_snoop: no
has_coherent_ggtt: yes
unfenced_needs_alignment: no
hws_needs_physical: no
cursor_needs_physical: no
has_csr: yes
has_ddi: yes
has_dp_mst: yes
has_dsb: no
has_dsc: no
has_fbc: yes
has_fpga_dbg: yes
has_gmch: no
has_hdcp: yes
has_hotplug: yes
has_hti: no
has_ipc: yes
has_modular_fia: no
has_overlay: no
has_psr: yes
has_psr_hw_tracking: yes
overlay_needs_physical: no
supports_tv: no
rawclk rate: 24000 kHz
Has logical contexts? yes
scheduler: 1f
i915.vbt_firmware=(null)
i915.modeset=-1
i915.lvds_channel_mode=0
i915.panel_use_ssc=-1
i915.vbt_sdvo_panel_type=-1
i915.enable_dc=-1
i915.enable_fbc=1
i915.enable_psr=-1
i915.psr_safest_params=no
i915.enable_psr2_sel_fetch=no
i915.disable_power_well=1
i915.enable_ips=1
i915.invert_brightness=0
i915.enable_guc=0
i915.guc_log_level=-1
i915.guc_firmware_path=(null)
i915.huc_firmware_path=(null)
i915.dmc_firmware_path=(null)
i915.mmio_debug=0
i915.edp_vswing=0
i915.reset=3
i915.inject_probe_failure=0
i915.fastboot=-1
i915.enable_dpcd_backlight=-1
i915.force_probe=
i915.fake_lmem_start=0
i915.request_timeout_ms=20000
i915.enable_hangcheck=yes
i915.load_detect_test=no
i915.force_reset_modeset_test=no
i915.error_capture=yes
i915.disable_display=no
i915.verbose_state_checks=yes
i915.nuclear_pageflip=no
i915.enable_dp_mst=yes
i915.enable_gvt=no
That sort of thing keeps happening. I thought mesa was to blame, but maybe it isn't. I tried to build xorg-server-git from the AUR, but it failed to start (there was a build error about xorgproto, even though I installed xorgproto-git). Is there a *working* PKGBUILD somewhere?
I am currently using the modesetting driver again - xf86-video-intel did not work either.
Enabling IOMMU didn't help - dmesg says scalable mode is supported, but it didn't work. At least the Xorg freezes, unlike DRM freezes, are recoverable by killing the X server.
And what does "i915.reset=3" mean? I did not set it on the command line, and the default is 2 according to modinfo -p i915.
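For anyone who wants to check the parameter section of a dump like the one above against what they actually passed to the kernel, here is a minimal sketch that pulls the "i915.name=value" lines out of an error-state dump into a dictionary. The function name parse_i915_params and the sample text are made up for illustration; the sketch only assumes the dump prints one parameter per line in the form shown above. It does not explain what a given value means.

```python
import re

def parse_i915_params(dump_text):
    """Collect 'i915.<name>=<value>' lines from an error-state dump.

    Hypothetical helper: assumes one parameter per line, as in the dump above.
    Values are kept as strings exactly as printed (an empty value stays '').
    """
    params = {}
    for line in dump_text.splitlines():
        match = re.match(r"^i915\.(\w+)=(.*)$", line.strip())
        if match:
            params[match.group(1)] = match.group(2)
    return params

# Small sample in the same format as the dump; non-parameter lines are ignored.
sample = """Platform: SKYLAKE
i915.reset=3
i915.enable_guc=0
i915.force_probe=
"""

params = parse_i915_params(sample)
print(params["reset"])
```

The result can then be compared by hand with the output of modinfo -p i915 or with the kernel command line, to see which values differ from what was set explicitly.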
Because of the inordinate number of freezes I've had lately, I tried downgrading intel-ucode (didn't seem to help either). The freezes happen with every kernel except the 5.4.y series, but as I've stated elsewhere, using that kernel is not a long-term solution. I've been building my own kernels using custom configs for years, including the 5.4 kernel, so that can't be the problem; the 5.4 config is as close as possible to the 5.12 and mainline configs. I've had uptimes of up to a week without freezes with 5.12, but of late it's just horrible. Totally broken.
I restored my system partitions from a known good backup and redid the recent updates; that didn't help either.
Can it really be that the kernel is to blame? I can't imagine Linus Torvalds putting up with simply *abandoning* i915 users. After all, Intel has a stake in the Linux kernel. The kernel doc says changes that don't work will be reverted, but that isn't happening.
Tried the distro kernel (not even up-to-date, and mirrors.dotsrc.org seems to be broken again, and I also get more alsa-lib errors). Currently trying to update my config using modprobed-db (aware of the possibility of kernel modules changing their names, don't need a lecture).
Bingo! The distribution kernel (5.12.12.artix1-1) froze too. Expected no less, but had to try again, in case something changed. No weird programs running (vlc2, musescore, golly), just normal stuff that can be expected to work (mate-terminal, pluma, MATE desktop, all latest versions built from source, and gcc11). Many, but not all, freezes happen when gcc is running and I edit files with pluma or grep git logs (gedit and gvim (binary package) have been known to freeze too). gtk3 is a possible freeze candidate, but not involved in all freezes, unless the entire MATE desktop is broken. But all other desktops are either toys (lxde, lxqt), severely dated (xfce), or bloated and full of spyware (gnome including flashback, kde, and presumably all the rest).
Guys, it's beginning to look like Artix doesn't work for me. I've been thinking for a while of switching to Gentoo (or FreeBSD if the Linux kernel deserts me), but I'll miss pacman.
Are you going to shut me down again because I mentioned software other than xorg-server in this post?
As I said, everything still works with the 5.4 kernel series, so I don't think I have hardware damage (or a rootkit, unless there is a rootkit that specifically targets kernels newer than 5.4).