Greetings,
I have been unable to suspend my computer lately by running
loginctl suspend
or through the regular graphical options presented by
lightdm/xfce4 (also previously,
sddm/LXQT). When I first noticed this, I assumed this was because of some error I made and installed Artix afresh (which is why I have xfce4 now).
I have been unable to ascertain why this is. But one curious thing that I have noticed is that since this started happening,
at least one (sometimes two) of the cores of my processor is being used at 100% at all times, regardless of what I am running (picture attached). Could anyone let me know what the issue is or help me discover it?
Any and all help would be highly appreciated. Thanks. :)
Picture description: htop running within lxterminal on awesome window manager.
Do you know what process is actually eating away at your CPU? Maybe dbus is getting spammed with messages or something.
"K" would show kernel threads in htop, or top would show these by default. Whatever is using the CPU is not appearing in that htop view for some reason.
kworker/3:0+pm, kworker/2:0+pm, ksoftirqd/3, and ksoftirqd/2 are using up the CPU. I tried following the answer by tanius on <https://askubuntu.com/questions/33640/kworker-what-is-it-and-why-is-it-hogging-so-much-cpu> and discovered "Your BIOS is broken; bad RMRR [0x00000000ad800000-0x00000000afffffff]" and "no suspend buffer for LTR". I think the latter is the reason behind failure to suspend.
However, being a noob, I do not know what to do to resolve these issues. Any further assistance or guidance would be highly appreciated. :)
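For reference, the kernel threads named above can also be listed from a terminal without htop. This is only a minimal sketch, assuming the procps version of ps; the function name top_cpu is made up for illustration.

```shell
#!/bin/bash
# Print the N busiest tasks (default 10), kernel threads included.
# Note: %CPU from ps is averaged over each task's lifetime, so a thread
# that only recently started spinning may rank lower than it does in htop.
top_cpu() {
    local n=${1:-10}
    # One extra line is kept for the header row.
    ps -eo pid,pcpu,comm --sort=-pcpu | head -n "$((n + 1))"
}

top_cpu 5
```

If the kworker or ksoftirqd threads show up near the top here as well, that is consistent with what "K" in htop reveals.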
Open a root terminal and issue:
# dmesg -wHT &
# tail -f /var/log/everything.log &
The syslog daemon should capture all kernel messages and perhaps dmesg is redundant, but you never know. Also, if you're not running syslog-ng but metalog, then the logfile is
/var/log/everything/current.
Then, from the same terminal try to suspend either with
loginctl suspend or
echo mem >| /sys/power/state and post your output here.
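Since the log file location depends on which syslog daemon is running, a small helper can pick whichever of the two paths mentioned above exists. This is a rough sketch; the function name pick_syslog and its optional prefix argument are hypothetical and only there so it can be tried on a scratch directory.

```shell
#!/bin/bash
# Print the first syslog file that exists:
#   /var/log/everything.log      (syslog-ng)
#   /var/log/everything/current  (metalog)
# The optional argument is a path prefix, used only for dry runs.
pick_syslog() {
    local prefix=${1:-} f
    for f in "$prefix/var/log/everything.log" \
             "$prefix/var/log/everything/current"; do
        if [ -f "$f" ]; then
            printf '%s\n' "$f"
            return 0
        fi
    done
    return 1
}

# Usage (as root):
#   tail -f "$(pick_syslog)" &
#   loginctl suspend
```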
Picture 1:
Picture 2:
Picture 3: I performed echo mem >| /sys/power/state
Ouch, that dmesg output looks bad. It seems like some devices can't suspend and/or are sending a wakeup event. It looks like a kernel bug, and the processes you mentioned earlier are all kernel threads. It's probably a driver bug.
So, is there a way to solve the issue?
If yes, please tell me how or tell me where to look for the solution.
If not, do tell me how to approach diagnosing problems like these, so that the next time I ask a question on the forum it is simpler and more succinct.
Thanks. :)
If you still have older kernel versions in your cache, you can try those and see if the issue occurs. You could also try some of the other kernel packages (like linux-lts) and see if the issue occurs there. But unfortunately aside from messing around with different kernel versions and hoping they work, there's not really much you can do.
Nah no worries, you did fine. A million things could cause suspend to fail and determining the probable cause as a kernel/driver bug in a few posts is pretty good.
One final question, do I
pacman -R
or
pacman -Rns
when I remove the old kernel? I do not want to break anything unnecessarily.
Assuming you have another kernel installed, I don't think it particularly matters. You probably shouldn't remove anything, though, if you're just testing different kernels (linux-lts vs linux or whatever). Just pick the one you want to boot into in the grub screen.
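Before removing anything, it may help to see which kernel images are actually present and which one is currently running. A small sketch, assuming the images live in /boot as vmlinuz-* files; the function name list_kernel_images is made up for illustration.

```shell
#!/bin/bash
# List the kernel images found in a boot directory (default /boot).
list_kernel_images() {
    local boot=${1:-/boot} img
    for img in "$boot"/vmlinuz-*; do
        # Skip the unexpanded pattern when nothing matches.
        [ -e "$img" ] && printf '%s\n' "${img##*/}"
    done
    return 0
}

list_kernel_images
# The running kernel version, for comparison:
uname -r
```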
Hi. This is not going to solve the suspend issue, but it will help you get rid of the "AER" messages in your dmesg. I applied it on my laptop because I get that annoying message too.
Edit /etc/default/grub.
In the line GRUB_CMDLINE_LINUX_DEFAULT, add "pci=noaer", then execute:
grub-mkconfig -o /boot/grub/grub.cfg
I attached my own configuration in case you want to check it before editing anything.
About the suspend issue, I read it's a kernel problem. I see the problem is more common in Archlinux kernels.
https://www.reddit.com/r/linuxquestions/comments/53rze7/suspend_fails_because_of_usb_controller/
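The GRUB edit described above can also be scripted. This is only a sketch, assuming GNU sed and a double-quoted GRUB_CMDLINE_LINUX_DEFAULT line; the function name add_kernel_param is hypothetical, and it is worth backing up the file first.

```shell
#!/bin/bash
# Append a kernel parameter to GRUB_CMDLINE_LINUX_DEFAULT in the given file,
# unless it is already there. Meant for simple parameters such as pci=noaer
# (no slashes or regex metacharacters).
add_kernel_param() {
    local param=$1 file=$2
    # Already present on that line: nothing to do.
    if grep -Eq "^GRUB_CMDLINE_LINUX_DEFAULT=.*[\" ]${param}([\" ]|\$)" "$file"; then
        return 0
    fi
    # Insert the parameter just before the closing quote.
    sed -i -E "s/^(GRUB_CMDLINE_LINUX_DEFAULT=\")(.*)\"/\1\2 ${param}\"/" "$file"
}

# Usage (as root):
#   cp /etc/default/grub /etc/default/grub.bak
#   add_kernel_param pci=noaer /etc/default/grub
#   grub-mkconfig -o /boot/grub/grub.cfg
```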
I have installed the new kernel and run
grub-mkconfig -o /boot/grub/grub.cfg
Output:
Generating grub configuration file ...
Found theme: /usr/share/grub/themes/artix/theme.txt
Found linux image: /boot/vmlinuz-linux-lts
Found initrd image: /boot/initramfs-linux-lts.img
Found fallback initrd image(s) in /boot: initramfs-linux-lts-fallback.img
Found linux image: /boot/vmlinuz-linux
Found initrd image: /boot/initramfs-linux.img
Found fallback initrd image(s) in /boot: initramfs-linux-fallback.img
WARNING: Failed to connect to lvmetad. Falling back to device scanning.
Found memtest86+ image: /boot/memtest86+/memtest.bin
done
However, I cannot find the new kernel in the grub menu. I tried to follow <https://joshtronic.com/2017/10/13/how-to-downgrade-to-the-lts-linux-kernel-on-arch/> but there is no /boot/loader/entries/ directory in my /boot/.
Also, this worked in removing the AER message. Thanks, @jrballesteros05.
I think you need to do "mkinitcpio -p linux-lts" as well.
I did that just now and rebooted but it still did not display the lts option.
Do I need to run grub-mkconfig again to make it appear?
Hello again. I've been reading about your suspend issue, and many sites say that it could be a problem with the "xhci" module. Could you execute this as root:
modprobe -r xhci
And then try to suspend?
Hi again. I replied with the "Quick reply" option but I cannot see my reply so I will post it again.
I've been reading about your suspend issue, and it seems that there is a problem with the "xhci" module.
Can you execute these commands:
modprobe -r xhci_pci
modprobe -r xhci_hcd
And try to suspend?
My output of dmesg shows no errors related to the xhci module. So I don't see how that will help.
The only errors I see in dmesg now are related to gpu and bluetooth.
Hi, one of the screenshots you share shows an error like this:
"usb_dev_suspend+0x0/0x10 return -16"
I googled it, and the error seems to be related to the "xhci" module. The commands I sent you were only for testing purposes; when you reboot the machine, the modules will load automatically again. If the test works, you might have to write a script (or maybe there is a better solution) that unloads the modules before suspending and loads them again on resume.
You can read here:
https://forums.fedoraforum.org/showthread.php?249335-Can-t-suspend-or-hibernate
Best regards.
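To pull lines like the one quoted above out of a long dmesg dump, a small filter can help. This is a sketch that assumes GNU sed and messages of the form "callback+0xOFF/0xLEN ... return(s) -N"; the function name failed_pm_callbacks is made up. On Linux, error 16 is EBUSY, meaning the device reported itself busy.

```shell
#!/bin/bash
# Read kernel log text on stdin and print "callback errno" for each line
# that reports a callback returning a negative error code.
failed_pm_callbacks() {
    sed -nE 's/(^|.*[^A-Za-z0-9_])([A-Za-z0-9_]+)\+0x[0-9a-f]+\/0x[0-9a-f]+.*returns? (-[0-9]+).*/\2 \3/p'
}

# Usage (as root):
#   dmesg | failed_pm_callbacks
echo 'usb_dev_suspend+0x0/0x10 return -16' | failed_pm_callbacks
```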
That's odd. It should appear in grub. Did you run grub-mkconfig again? But even without it, I would have expected the entry to appear in your grub menu.
When adding new kernels, grub-mkconfig must be re-run for them to appear. Also, if the alternate kernel doesn't solve the xhci error, running rmmod xhci_pci xhci_hcd before suspend should fix it (provided the modules aren't in use by something else).
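One way to check whether the regenerated config actually references the new kernel is to list the kernel images named in it. A minimal sketch, assuming the usual "linux <image> <options>" lines that grub-mkconfig writes; the function name kernels_in_grub_cfg is hypothetical.

```shell
#!/bin/bash
# Print each distinct kernel image referenced by a grub.cfg
# (default /boot/grub/grub.cfg).
kernels_in_grub_cfg() {
    local cfg=${1:-/boot/grub/grub.cfg}
    # Menu entries load a kernel with: linux <image> <options...>
    awk '$1 == "linux" { print $2 }' "$cfg" | sort -u
}

# Usage:
#   kernels_in_grub_cfg
#   kernels_in_grub_cfg | grep -q 'vmlinuz-linux-lts' && echo "lts entry present"
```

If vmlinuz-linux-lts is missing from the output, the file GRUB reads was probably not the one that was regenerated.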
Your output includes:
WARNING: Failed to connect to lvmetad. Falling back to device scanning.
Do you have the osprober package installed? I once had to remove it to fix this.
Then run grub-mkconfig again.
I tried this and it solved both the issues:
Thanks,
@jrballesteros05 . :)
The problem was that I was running it on grub.conf. When I ran the same on grub.cfg , the options started appearing in the "Advanced options" section.
I have always had this error appear when I boot. I never looked into why because it never caused me any inconvenience or stopped me from using the PC normally. I tried to look for the package you mentioned but I could not find it through either
pacman -Q | grep "osprober" or
pacman -Ss osprober.
As I am going to mark this issue as solved (which I assume makes it turn off replies), I shall PM you with further details after I look into this error and try to find this package.
Thanks everyone for all the guidance and input. :)
Hello. This thread is not really solved. We only found out what the problem is, but it isn't fixed, because the xhci module is necessary for the USB ports.
This kernel bug is quite old, and it's quite strange that it hasn't been fixed. What people normally do is use a workaround that consists of two scripts.
One with this content:
#! /bin/bash
modprobe -r xhci_pci xhci_hcd
And the other with this one:
#! /bin/bash
modprobe -a xhci_pci xhci_hcd
In this post (https://bbs.archlinux.org/viewtopic.php?id=217775) from the Arch Linux forum, they solved it using systemd, but there is no systemd installed here. In this post (https://bugs.launchpad.net/ubuntu/+source/pm-utils/+bug/562484/comments/3) from Ubuntu, they solve it through pm-utils, but I don't think you have pm-utils installed, and I am not sure installing it would be a good idea. I also don't know whether there is something similar in "Runit" or "Elogind" to apply the workaround.
Hi again, I've been reading about Elogind, and it could solve the problem after all. In the Artix wiki there is an entry (https://wiki.artixlinux.org/Main/Elogind) that talks about running scripts when suspending, and there is this one (https://wiki.gentoo.org/wiki/Elogind) from the Gentoo wiki.
So in
/usr/lib/elogind/system-suspend
you can execute scripts before suspending and on resume. So, it would be something like this:
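#!/bin/bash
# The hook is called with $1 = pre or post, and $2 = the type of sleep.
case "$1/$2" in
    pre/*)
        # Unload the USB controller driver before sleeping.
        modprobe -r xhci_pci
        ;;
    post/*)
        # Load it again on resume.
        modprobe xhci_pci
        ;;
esac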
I am not an expert on this; if anyone sees an error, please let me know. I hope it helps.
Best regards.
I'm not sure how removing osprober would affect this. This warning message has to do with lvm. It happens if your system is configured to load lvm volumes on boot but is not using the lvmetad daemon. It's not an error or anything; using device scanning to load lvm is perfectly fine. The daemon is supposed to be faster however.
As per the inputs here, this is what I have done:
Here are the results/observations:
Unless I have xhci_pci manually disabled, I cannot suspend (so same as before, in essence).
With xhci_pci disabled :
- Can suspend
- Cannot use USB ports
With xhci_pci enabled :
- Cannot suspend
- Can use USB ports
- The battery drains faster than Vin Diesel races in the movies
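To confirm which of the two states the machine is in before testing suspend, the module list can be checked directly. A small sketch reading /proc/modules; the function name module_loaded and its optional second argument (an alternative list file, for dry runs) are hypothetical.

```shell
#!/bin/bash
# Succeed if the named module appears in the loaded-module list.
# Drivers built into the kernel image do not show up in /proc/modules.
module_loaded() {
    local name=$1 list=${2:-/proc/modules}
    # The module name is the first field of each line.
    awk -v m="$name" '$1 == m { found = 1 } END { exit !found }' "$list"
}

# Usage:
#   module_loaded xhci_pci && echo "xhci_pci is loaded"
```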
Hello again, I think the Artix wiki has a mistake (so Artix guys, you should check this). The script must be in "/usr/lib/elogind/system-sleep/" instead of "/usr/lib/elogind/system-suspend".
I tried it myself and it works.
Of course, there is a problem with this workaround: in my case, I have an external keyboard connected to my laptop through USB, so I cannot resume with the external keyboard. I must do it with the laptop keyboard.
Best regards.
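Putting the pieces of this thread together, a hook for the system-sleep directory might look roughly like the sketch below. The file name xhci.sh, the function name xhci_hook and the MODPROBE override are assumptions for illustration, not something taken from the wiki; the override only exists so the hook can be dry-run without touching the real modules.

```shell
#!/bin/bash
# Hypothetical file: /usr/lib/elogind/system-sleep/xhci.sh (must be executable).
# The hook is called with $1 = pre or post, and $2 = the type of sleep.

# Override for a dry run, e.g. MODPROBE="echo modprobe"
MODPROBE=${MODPROBE:-modprobe}

xhci_hook() {
    case "$1" in
        pre)
            # Unload the USB controller drivers before sleeping.
            $MODPROBE -r xhci_pci xhci_hcd
            ;;
        post)
            # -a makes modprobe treat every argument as a module name.
            $MODPROBE -a xhci_pci xhci_hcd
            ;;
    esac
}

xhci_hook "$1"

# Install (as root):
#   install -m 755 xhci.sh /usr/lib/elogind/system-sleep/
```

As noted above, USB devices are unavailable while the modules are unloaded, so a USB keyboard cannot be used to wake the machine.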
Yeah, the docs say system-sleep, not system-suspend. I'll update it. Thanks.
Thanks for investigating this. The wiki entry has already been fixed by
@Dudemanguy.
@Anaximenes , please mark this thread as [SOLVED] by editing the first post and clicking the green "SOLVE TOPIC" button.