Topic: Thunderbolt on Artix, (renamed, was: "Help building an old kernel (5.19.0)")

Reply #30
Sure, share it, and maybe with some other distros as well.
The spirit of sharing is why we love the open source community so much.
Peace. :)

Reply #31
Yep, I will share it on the Arch forum, and other Arch-derived distros, including Artix, will probably inherit the fix.

On Debian-based distros the setup is completely different: the fix does not apply, and may not even be required.

Reply #32
Wait... I claimed victory too early...

As soon as I use the card once, e.g. with nvidia-smi or with python and torch.cuda.is_available(), the GPU is unavailable the next time I try to use it...

And immediately after the first use, dmesg adds these lines:

Code:
[  592.012595] NVRM objClInitPcieChipset: *** Chipset Setup Function Error!
[  594.048546] NVRM gpuInitOptimusSettings_IMPL: SBIOS did not acknowledge cfg space owner change
[  594.417099] NVRM s_executeBooterUcode_TU102: Booter failed with non-zero error code: 0xa
[  594.417102] NVRM kgspExecuteBooterUnloadIfNeeded_TU102: failed to execute Booter Unload: 0xffff
[  594.417107] NVRM nvAssertFailedNoLog: Assertion failed: rmStatus == NV_OK @ osinit.c:1926
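
For anyone trying to reproduce this, a small helper can pull just the NVRM lines out of the kernel log. This is only a sketch: the function name nvrm_lines is made up for illustration, and reading dmesg may require root depending on your kernel settings.

```shell
#!/bin/sh
# Print only the NVRM (NVIDIA kernel driver) lines from kernel-log text on stdin.
# "nvrm_lines" is a hypothetical helper name, not part of any NVIDIA tool.
nvrm_lines() {
    # -F: match the literal string; "|| true" keeps the exit status 0
    # when there are no NVRM lines at all.
    grep -F 'NVRM' || true
}

# Typical use right after the first GPU access:
#   dmesg | nvrm_lines
```

If this prints nothing after using the card, the driver has not logged any NVRM message since boot.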

Reply #33
And actually the situation is the same if I remove my "fix" in 60-nvidia.rules...

I was just too happy that pytorch was returning True, and I did not test nearly thoroughly enough...
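
Since the failure only shows up on the second use, a check that runs the same command several times would have caught it. Below is a minimal sketch; check_repeated is a hypothetical helper name, and it assumes the command being tested exits non-zero when the GPU is unavailable.

```shell
#!/bin/sh
# Run a command N times and stop at the first failure.
# Returns 0 only if every run succeeded.
# $1 = number of runs, remaining arguments = the command to run.
check_repeated() {
    runs=$1
    shift
    i=1
    while [ "$i" -le "$runs" ]; do
        if ! "$@" >/dev/null 2>&1; then
            echo "run $i of $runs failed: $*" >&2
            return 1
        fi
        i=$((i + 1))
    done
    echo "all $runs runs succeeded: $*"
}

# Examples:
#   check_repeated 3 nvidia-smi
#   check_repeated 3 python -c 'import sys, torch; sys.exit(0 if torch.cuda.is_available() else 1)'
```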

Reply #34
OK so the secret is to launch:

Code:
sudo rc-service nvidia-persistenced start

And it does not seem really necessary to load the "missing" module nvidia_uvm (at least since I'm not using XOrg with the GPU).

NOTE: the exact order below is important:

* plug in the eGPU
* wait for the modules to load (no need for nvidia_uvm)
* sudo rc-service nvidia-persistenced start
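
The "wait for the modules to load" step can be scripted roughly as follows. This is a sketch under assumptions: wait_for_module is a hypothetical helper, it polls /proc/modules once per second, and the 60-second timeout in the example is arbitrary.

```shell
#!/bin/sh
# Wait until a kernel module shows up in the module list.
# $1 = module name
# $2 = timeout in seconds (default 30)
# $3 = module list file (default /proc/modules; overridable for testing)
# Returns 0 once the module is listed, 1 on timeout.
wait_for_module() {
    mod=$1
    timeout=${2:-30}
    list=${3:-/proc/modules}
    elapsed=0
    # Each line of /proc/modules starts with "<name> ", so anchoring on the
    # trailing space keeps "nvidia" from matching "nvidia_uvm".
    while ! grep -q "^${mod} " "$list" 2>/dev/null; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# After plugging in the eGPU:
#   wait_for_module nvidia 60 && sudo rc-service nvidia-persistenced start
```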

For hot-unplugging (yes, it is possible as long as XOrg is not involved, as is the case here), use the reverse order:

* sudo rc-service nvidia-persistenced stop
* sudo rmmod all of the nvidia* modules
* unplug
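
A rough sketch of the rmmod step: the helper below lists whichever nvidia modules are currently loaded, dependents first and the core nvidia module last, so that rmmod does not fail on a module that is still in use by another one. The function name is hypothetical, and the fixed list of four module names is an assumption about a typical driver install; adjust it if lsmod shows others.

```shell
#!/bin/sh
# Print the loaded nvidia* modules in an order suitable for rmmod:
# dependents first, the core "nvidia" module last.
# $1 = module list file (default /proc/modules; overridable for testing)
nvidia_modules_in_unload_order() {
    list=${1:-/proc/modules}
    # Assumed module names; nvidia_drm and nvidia_uvm may not be loaded at all.
    for mod in nvidia_drm nvidia_modeset nvidia_uvm nvidia; do
        if grep -q "^${mod} " "$list" 2>/dev/null; then
            printf '%s\n' "$mod"
        fi
    done
}

# Hot-unplug sequence (run before pulling the cable):
#   sudo rc-service nvidia-persistenced stop
#   for m in $(nvidia_modules_in_unload_order); do sudo rmmod "$m"; done
```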

 

Reply #35
With the recent Artix upgrade (kernel 6.1.4-artix1-1, nvidia-open-dkms 525.78.01-8, etc.), I no longer need any trick other than installing and enabling boltd.

* I no longer need to start or stop the nvidia-persistenced service, which stays stopped at all times
* I don't need to modify the udev rules
* the sequence for hotplug (for CUDA compute only) becomes: 1/ plug 2/ enjoy
* the sequence for hot-unplug (CUDA) becomes: 1/ rmmod all nvidia* modules 2/ unplug