Artix Linux Forum

Artix Linux => System => Topic started by: ####### on 14 December 2020, 17:32:48

Title: [SOLVED] Freeze - unfreeze - freeze, reboot to fix
Post by: ####### on 14 December 2020, 17:32:48
Earlier today I had the strange freezing thing again, which I'd had before when using Terminator (I used Terminator recently with no ill effects). The desktop would lock up for a few seconds, then you could briefly move the pointer before it locked up again. I closed all open apps, selected shutdown and rebooted, and after rebooting it worked again. I had started xfce4 terminal and Pale Moon Browser, then when I started Chromium the freezing started before Chromium had fully drawn itself onscreen. It did finish drawing eventually, in the periods of motion, before I closed it again. There was nothing unusual in me running these apps though, and it had not happened with them previously. After rebooting it was fine, and I ran updates which updated the kernel (linux-zen) and rebooted again.

The warnings, errors etc. from /var/log/debug are in the pastebin link below. Some are normal to see, but some are not. There are another 2 successful boots at the end, and the current session too, to give some idea of which are the usual things. Here's what's configured to go in there:

$outchannel mydebug,/var/log/debug,104857600
*.warning;*.err;*.crit;*.alert;*.emerg;*.panic  :omfile:$mydebug

https://pastebin.com/6AQbV8n9

Working fine since, but it was fine before too.
Title: Re: Freeze - unfreeze - freeze, reboot to fix
Post by: alium on 14 December 2020, 17:57:09
Your driver or DRM is causing a kernel panic... Unfortunately the combination of the nouveau or nvidia driver + DRM + the Linux kernel often causes problems.
It wouldn't be strange if the next release of the kernel or DRM fixes it again.

There is one solution (from my point of view). Say "f**k you Nvidia" and next time buy HW that is supported on Linux (Intel or AMD).

Try updating to 5.9.14
Title: Re: Freeze - unfreeze - freeze, reboot to fix
Post by: ####### on 14 December 2020, 20:55:40
Yes, I already upgraded earlier today after that happened. I usually run updates several times a day now; everything is so reliable, and if anything that approach is easier than leaving it longer:
$ uname -r
5.9.14-zen1-1-zen
But I had been running that kernel version for several days and used these apps with no bad effects, so this is a rarely seen effect and I wouldn't know if it was fixed or not:
[2020-12-09T13:34:05+0000] [ALPM] upgraded linux-zen (5.9.12.zen1-1 -> 5.9.13.zen1-1)
One day there will be fantastic open source graphics support. It does seem to be getting better overall, and using Nouveau and posting details when things go wrong is a small step towards that goal.
When it comes to used hw it mostly depends on what is about for a cheap price, and the GPU is often not even mentioned, so it's pot luck. As it happens all my computers had Nvidia, perhaps why they were cheap  ;D
Title: Re: Freeze - unfreeze - freeze, reboot to fix
Post by: alium on 14 December 2020, 21:56:00
Quote
One day there will be fantastic open source graphics support

That day is here, you just can't buy Nvidia   :D

AMD graphics have great kernel support. I have a Ryzen 3400G and it's a great processor and GPU ...

I'm closing the thread, if anything, get in touch
Title: Re: Freeze - unfreeze - freeze, reboot to fix
Post by: tintin on 15 December 2020, 06:23:39
I just replaced my nvidia card with its own defective driver with a radeon rx550: everything works without having to install anything 8-)

Title: Re: Freeze - unfreeze - freeze, reboot to fix
Post by: ####### on 17 December 2020, 22:37:56
It happened again today. I had just booted and started Terminator (which had been working earlier that day), then the freezing started. When I tried CTRL-ALT-F2 nothing happened except that the mouse pointer disappeared and nothing responded at all. But then with CTRL-ALT-F7 everything started working normally again; I could then switch TTY as normal, and it's still fine a couple of hours later. The X session refreshed or something? I've switched from XFCE to MATE now, to see if it happens again.
https://pastebin.com/KX2RT38E

Code: [Select]
$ ag "(EE)" Xorg.0.log
[    10.413] Current Operating System: Linux ax 5.9.14-zen1-1-zen #1 ZEN SMP PREEMPT Sat, 12 Dec 2020 22:20:32 +0000 x86_64
(WW) warning, (EE) error, (NI) not implemented, (??) unknown.
[    10.424] (EE) Error systemd-logind returned paused fd for drm node
[    10.498] (EE) Failed to load module "nv" (module does not exist, 0)
[    10.662] (II) Initializing extension MIT-SCREEN-SAVER
[    11.825] (EE) systemd-logind: failed to take device /dev/dri/card0: Device or resource busy
[    11.825] (EE) /dev/dri/card0: failed to set DRM interface version 1.4: Permission denied
[    46.003] (EE) event12 - AlpsPS/2 ALPS DualPoint Stick: client bug: event processing lagging behind by 17ms, your system is too slow
[    67.997] (EE) client bug: timer event13 trackpoint: scheduled expiry is in the past (-357ms), your system is too slow
[    67.997] (EE) client bug: timer event13 trackpoint: scheduled expiry is in the past (-71ms), your system is too slow
[    67.997] (EE) client bug: timer event13 trackpoint: scheduled expiry is in the past (-60ms), your system is too slow
[    67.997] (EE) client bug: timer event13 trackpoint: scheduled expiry is in the past (-48ms), your system is too slow
[    67.997] (EE) client bug: timer event13 trackpoint: scheduled expiry is in the past (-35ms), your system is too slow
[    67.997] (EE) client bug: timer event13 trackpoint: scheduled expiry is in the past (-22ms), your system is too slow
[    67.998] (EE) client bug: timer event13 trackpoint: scheduled expiry is in the past (-10ms), your system is too slow
[    84.034] (EE) event13 - AlpsPS/2 ALPS DualPoint TouchPad: client bug: event processing lagging behind by 1250ms, your system is too slow
[   123.873] (EE) client bug: timer event12 debounce: scheduled expiry is in the past (-513ms), your system is too slow
[   123.873] (EE) client bug: timer event12 debounce: scheduled expiry is in the past (-399ms), your system is too slow
[   123.873] (EE) client bug: timer event12 debounce short: scheduled expiry is in the past (-412ms), your system is too slow
[   172.938] (EE) client bug: timer event13 middlebutton: scheduled expiry is in the past (-1937ms), your system is too slow
[   172.938] (EE) client bug: timer event13 middlebutton: scheduled expiry is in the past (-1709ms), your system is too slow
[   178.949] (EE) event13 - AlpsPS/2 ALPS DualPoint TouchPad: client bug: event processing lagging behind by 1973ms, your system is too slow
[   235.376] (EE) client bug: timer event12 debounce: scheduled expiry is in the past (-22ms), your system is too slow
[   257.387] (EE) event13 - AlpsPS/2 ALPS DualPoint TouchPad: client bug: event processing lagging behind by 402ms, your system is too slow
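To get a feel for how long each stall lasted, the "too slow" lines can be boiled down to timestamp and lag pairs. This is only a rough sketch in Python: the sample lines are copied from the log above, while the regex and the helper names (lag_events, worst_lag) are made up for illustration and assume the two message wordings seen in this log.

```python
import re

# Sample lines copied from the Xorg.0.log excerpt above.
SAMPLE = """\
[    46.003] (EE) event12 - AlpsPS/2 ALPS DualPoint Stick: client bug: event processing lagging behind by 17ms, your system is too slow
[    67.997] (EE) client bug: timer event13 trackpoint: scheduled expiry is in the past (-357ms), your system is too slow
[    84.034] (EE) event13 - AlpsPS/2 ALPS DualPoint TouchPad: client bug: event processing lagging behind by 1250ms, your system is too slow
[   172.938] (EE) client bug: timer event13 middlebutton: scheduled expiry is in the past (-1937ms), your system is too slow
[   178.949] (EE) event13 - AlpsPS/2 ALPS DualPoint TouchPad: client bug: event processing lagging behind by 1973ms, your system is too slow
"""

# Assumption: only the two wordings seen in this log are matched.
LINE_RE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+\(EE\).*?"
    r"(?:lagging behind by |in the past \(-)(?P<ms>\d+)ms"
)


def lag_events(text):
    """Return a list of (seconds_since_X_start, lag_ms) tuples."""
    events = []
    for line in text.splitlines():
        match = LINE_RE.search(line)
        if match:
            events.append((float(match.group("ts")), int(match.group("ms"))))
    return events


def worst_lag(text):
    """Return the (timestamp, lag_ms) pair with the largest lag, or None."""
    events = lag_events(text)
    return max(events, key=lambda event: event[1]) if events else None


if __name__ == "__main__":
    for ts, ms in lag_events(SAMPLE):
        print(f"{ts:10.3f}s  lag {ms} ms")
    print("worst:", worst_lag(SAMPLE))
```

Lags of a second or two at a time would fit the "lock up for a few seconds, then briefly move the pointer" pattern, though that is only a guess from these lines.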
Title: Re: Freeze - unfreeze - freeze, reboot to fix
Post by: alium on 17 December 2020, 23:09:55
Please use the nvidia driver, not the nouveau driver. Nouveau can have a bug, or the HW can be failing... use another driver to rule out the possibilities.

Is it a laptop or a desktop? It could be a bad cable.
Title: Re: Freeze - unfreeze - freeze, reboot to fix
Post by: ####### on 18 December 2020, 21:25:14
The MATE desktop was much worse; it froze up after a few minutes using Terminator or MATE Terminal.
I downloaded the latest Linux kernel from the Arch testing repo:
linux-5.10.1.arch1-1-x86_64.pkg.tar.zst
and installed it with pacman -U and made sure GRUB booted that one.
So far MATE is running with 2 Terminators, one MATE Terminal and a few other apps open. It's hard to be certain after only a short time testing, but it looks promising that it might be fixed in 5.10. It might just be waiting to freeze any second now though!
$ uname -r
5.10.1-arch1-1
This is an Arch kernel, not an Artix one; it was the newest precompiled kernel package I found. I have linux-zen installed too, so I could easily have chosen that in the GRUB menu if this had not worked. If anyone else tries this, a plan B in case it doesn't work is a good idea, although it's not the first time I've used an Arch kernel in Artix and I've had no problems from that before.  ;D
Title: Re: Freeze - unfreeze - freeze, reboot to fix
Post by: alium on 18 December 2020, 23:20:54
Don't install the Arch Linux kernel!!! DKMS will not work! 5.10.1 (linux and linux-zen) are in our [testing] repo too.
Title: Re: Freeze - unfreeze - freeze, reboot to fix
Post by: ####### on 19 December 2020, 03:25:15
That would be a better idea (I've no dkms packages installed)
$ uname -r
5.10.1-artix1-1
It's under <mirrorname>/repos/gremlins/os/x86_64/; I had looked in system, as the regular kernel is there  :-[
There's a 5.10 Zen version I haven't tried yet in there too.

No - bad idea, the Artix 5.10 still freezes! Back to the Arch 5.10, no issues with that so far.
Title: Re: Freeze - unfreeze - freeze, reboot to fix
Post by: alium on 19 December 2020, 08:19:20
The Artix and Arch kernels are the same; the only difference is the gcc signature.
Title: Re: Freeze - unfreeze - freeze, reboot to fix
Post by: ####### on 19 December 2020, 20:16:55
Not exactly the same, according to .BUILDINFO (some minor buildenv differences) and .PKGINFO (different size, indicates different compiler options might have been used).
But while I was looking into that, the Arch 5.10 started the freeze/unfreeze, although it had been fine for several hours of use over a few boots up to then. So the build differences aren't relevant. Trying the lts kernel now, so far so good:
$ uname -r
5.4.84-1-lts
Title: Re: Freeze - unfreeze - freeze, reboot to fix
Post by: alium on 19 December 2020, 21:36:00
Seems like a HW problem to me...
Title: Re: Freeze - unfreeze - freeze, reboot to fix
Post by: ####### on 19 December 2020, 22:44:11
Yes, it's a possibility, (I had a faulty mobo once and after half an hour or so running the desktop froze and everything stopped dead, replacing it fixed it) except this suggests it's a known / current bug:
https://bbs.archlinux.org/viewtopic.php?id=259770
https://bugzilla.redhat.com/show_bug.cgi?id=1894257
And I'm trying to encourage it to happen by using MATE and Terminator. With 5.9 yesterday it would barely last a few minutes, whereas with XFCE / Pale Moon / XFCE4 terminal it didn't happen at all, except with Terminator and once when I started Chromium as well.
Certainly hardware specific.
Title: Re: Freeze - unfreeze - freeze, reboot to fix
Post by: ####### on 21 December 2020, 15:24:03
Still working OK. Looking at various bug reports, it first appeared in Arch in the 5.8.14 kernel. Also affected are 5.8.0-29-generic x86_64 (Ubuntu 20.10) and 5.8.0-7630-generic (Ubuntu), so they might have backported it or use a different versioning scheme.
I might try the 5.8.14 to confirm this after testing the 5.4.84-1-lts more.
Title: Re: [SOLVED] Freeze - unfreeze - freeze, reboot to fix
Post by: ####### on 25 December 2020, 01:36:30
The 5.8.14 worked with no problems for an hour or two, the 5.9.14 worked with no problems for a few hours. While I was using the LTS some upgrades came in, including gtk2 and mesa. Could be fixed somewhere like that perhaps. - No, it was just waiting until I thought it had gone then did it again.
Title: Re: [SOLVED] Freeze - unfreeze - freeze, reboot to fix
Post by: ####### on 28 December 2020, 14:55:54
Still no problems with 5.8.14-artix1-1, and the earliest kernel to show the freeze so far is 5.9.2.artix1-1. Just downloaded linux-5.9.1.arch1-1-x86_64.pkg.tar.zst and linux-5.9.arch1-1-x86_64.pkg.tar.zst from the Arch archive (not in my cache or the Artix archive, 5.9.a skipped too)
Title: Re: [SOLVED] Freeze - unfreeze - freeze, reboot to fix
Post by: alium on 28 December 2020, 15:54:54
Does the newest LTS, 5.4.85, work? If not, it could be a regression.
Title: Re: [SOLVED] Freeze - unfreeze - freeze, reboot to fix
Post by: ####### on 28 December 2020, 22:17:22
It's not that easy to tell what works and what doesn't: you can go all day with nothing, then boot up and it freezes after 5 minutes, or perhaps later in the boot. linux-lts (5.4.84-1) ran for 4 days without trouble. linux (5.8.14.artix1-1) ran for 2 days without trouble.
All these froze sooner or later, within about a day of use, some on the first boot, others after around 24 hours after several boots:
5.9.2.artix1-1 / 5.9.10.artix1-1 / 5.9.12.artix1-1 / 5.9.14.artix1-1 / 5.10.1.arch1-1 / 5.10.1.artix1-1
So now I'm on 5.9.1-arch1-1, because I realised it's quicker working backwards: a faulty kernel shows itself in less time than it takes to run a good one for days. My current theory is that something was introduced in the kernel in 5.8.14, as in the Arch forum discussion, but perhaps it wasn't added for my hardware until a bit later (there are different modules and they don't always get done all at once). If I can find when it happened, it might give some clue what it was. 5.8.14 was also the last release before the switch to 5.9.
So I think the LTS should work fine, and there's no problem with the Artix builds vs the Arch ones, but it would take the rest of the week to try it. Also the LTS I did try was a recent build, so presumably this isn't a compiler bug.

5.9.1 and 5.9.0 both had the problem. Tried 5.8.14 more and it's fine.
So this appeared with the 5.9 kernel, and it's still there in linux-git : 5.11.0-rc1-1-git-00073-g3516bd729358
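Lining up the results reported so far shows where the boundary sits. A minimal sketch in Python, using only the versions named in this thread; the names RESULTS, version_key and boundary are made up for illustration, and a "good" entry only means no freeze was seen in the time tested.

```python
# Results as reported in this thread: True = froze, False = no freeze seen.
RESULTS = {
    "5.4.84": False,   # linux-lts, ran 4 days
    "5.8.14": False,   # ran 2 days
    "5.9.0": True,
    "5.9.1": True,
    "5.9.2": True,
    "5.9.10": True,
    "5.9.12": True,
    "5.9.14": True,
    "5.10.1": True,
}


def version_key(version):
    """Turn '5.9.10' into (5, 9, 10) so that 5.9.10 sorts after 5.9.2."""
    return tuple(int(part) for part in version.split("."))


def boundary(results):
    """Return (newest version with no freeze, oldest version that froze)."""
    good = [v for v, froze in results.items() if not froze]
    bad = [v for v, froze in results.items() if froze]
    return max(good, key=version_key), min(bad, key=version_key)


if __name__ == "__main__":
    last_good, first_bad = boundary(RESULTS)
    print(f"last good: {last_good}, first bad: {first_bad}")
```

That puts the change somewhere between 5.8.14 and 5.9.0, which is the range to bisect.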
Now - try to bisect. Soooo slooooowww.... 7 hours to clone and build linux-git, with a 22GB build folder. Trying modprobed-db next, perhaps that will help a little.
Only an hour or two per build now, with -j1 so it runs in the background. v5.9-rc1 good, v5.9-rc6 bad. 1638 commits between them, only 4 in drivers/gpu/drm/nouveau/ and 3 of those for the nv50 family (my card):
ca386aa7155a drm/nouveau/kms/nv50-gp1xx: add WAR for EVO push buffer HW bug
a9cfcfcad50c drm/nouveau/kms/nv50-gp1xx: disable notifies again after core update
35dde8d40636 drm/nouveau/kms/nv50-: add some whitespace before debug message
v5.9-rc4 - bad. Next: fc8c70526bd30733ea8667adb8b8ffebea30a8ed, just before the 4 nouveau commits, which are between rc3 and rc4. If that's good it will narrow it down a lot.
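For a sense of scale: a bisection halves the candidate range with every build, so 1638 commits should need about 11 test kernels. A rough sketch in Python of the idea, not of git's actual implementation; bisect_steps and first_bad are made-up names, and the commits are just numbered stand-ins. It also assumes every good/bad verdict is correct, which is the weak point with a freeze that can take days to show up.

```python
import math


def bisect_steps(n_commits):
    """Roughly how many builds a bisection needs for n_commits candidates."""
    return math.ceil(math.log2(n_commits))


def first_bad(commits, is_bad):
    """Binary search for the first bad commit.

    Assumes commits are in history order, every commit before the culprit
    is good, the culprit and everything after it is bad, and the last
    commit is known to be bad.
    Returns (culprit, number_of_tests_run).
    """
    lo, hi = 0, len(commits) - 1
    tests = 0
    while lo < hi:
        mid = (lo + hi) // 2
        tests += 1
        if is_bad(commits[mid]):
            hi = mid          # culprit is mid or earlier
        else:
            lo = mid + 1      # culprit is after mid
    return commits[lo], tests


if __name__ == "__main__":
    commits = list(range(1638))                 # stand-ins for real commits
    culprit, tests = first_bad(commits, lambda c: c >= 1200)
    print(f"estimate: {bisect_steps(1638)} builds, "
          f"found commit {culprit} after {tests} builds")
```

At an hour or two per build that is a day or so of compiling, plus however long each kernel has to run before it can be called good.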
Title: Re: [SOLVED] Freeze - unfreeze - freeze, reboot to fix
Post by: ####### on 14 January 2021, 04:17:54
Closer  now...
ca386aa7155a drm/nouveau/kms/nv50-gp1xx: add WAR for EVO push buffer HW bug             < bug was here
a9cfcfcad50c drm/nouveau/kms/nv50-gp1xx: disable notifies again after core update  < testing here now!
35dde8d40636 drm/nouveau/kms/nv50-: add some whitespace before debug message < this only added a space in a comment
a255e9c8694d drm/nouveau/kms/gv100-: Include correct push header in crcc37d.c  < this isn't my "nv50"  gpu
fc8c70526bd3 drm/radeon: Prefer lower feedback dividers                                 < bug wasn't here

I think I see a possibility for the cause too:
Code: [Select]
bit from:
drivers/gpu/drm/nouveau/dispnv50/core507d.c

if (ntfy) {
        PUSH_MTHD(push, NV507D, SET_NOTIFIER_CONTROL,
                  NVDEF(NV507D, SET_NOTIFIER_CONTROL, MODE, WRITE) |
                  NVVAL(NV507D, SET_NOTIFIER_CONTROL, OFFSET, NV50_DISP_CORE_NTFY >> 2) |
                  NVDEF(NV507D, SET_NOTIFIER_CONTROL, NOTIFY, ENABLE));
}

PUSH_MTHD(push, NV507D, UPDATE, interlock[NV50_DISP_INTERLOCK_BASE] |
                                interlock[NV50_DISP_INTERLOCK_OVLY] |
          NVDEF(NV507D, UPDATE, NOT_DRIVER_FRIENDLY, FALSE) |
          NVDEF(NV507D, UPDATE, NOT_DRIVER_UNFRIENDLY, FALSE) |
          NVDEF(NV507D, UPDATE, INHIBIT_INTERRUPTS, FALSE),

                        SET_NOTIFIER_CONTROL,                           <<<<<< why is that floating about there, could it be a typo??
          NVDEF(NV507D, SET_NOTIFIER_CONTROL, NOTIFY, DISABLE));
Those seem to be lists of numbers (instructions) being "pushed" to the hardware.
That was added in "disable notifies again after core update" so I'm guessing, pending more testing ;D
Reading the commit messages, some of the things in this area were borrowed from NVIDIA, so perhaps it was in the NVIDIA driver first and was copied to Nouveau a bit later.
Still testing but no bug so far - looking more like it's the "add WAR for EVO push buffer HW bug"
An interesting thing - you know why this has gone unfixed for 3 months? Ben Skeggs, the nouveau kernel maintainer, has vanished. His last commit on GitHub was Nov 14th. Nothing recent on the LKML either.
Title: Re: [SOLVED] Freeze - unfreeze - freeze, reboot to fix
Post by: ####### on 19 January 2021, 19:51:32
Beware that patch:
https://bugzilla.kernel.org/show_bug.cgi?id=210333
 ???
Title: Re: [SOLVED] Freeze - unfreeze - freeze, reboot to fix
Post by: ####### on 02 February 2021, 03:23:00
The patch that blew up my mobo is now in the mainline kernel development tree. They made some additional changes in nouveau, so hopefully it won't have that effect any more and might fix the freezes instead. It should appear in the 5.11 kernel. I think I'll stick with the LTS kernel until that's been out a while, just in case...