etho keeps dropping and can't restart

I am having this trouble. The link is physically going down on eth0,
and when it comes back up it doesn't put back the correct route.


Jan 17 21:34:48 www3 kernel: e1000e 0000:00:19.0 eth0: NIC Link is Down
Jan 17 21:34:49 www3 named[16266]: listening on IPv4 interface eth0, 96.57.23.83#53
Jan 17 21:34:52 www3 ntpd[2822]: Deleting 27 eth0, [96.57.23.83]:123, stats: received=949, sent=887, dropped=0, active_time=230059 secs

Jan 17 21:34:52 www3 kernel: e1000e 0000:00:19.0 eth0: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Jan 17 21:34:55 www3 ntpd[2822]: Listen normally on 30 eth0 96.57.23.83:123

I have to restart it manually.

It drops the default route, and I have to run /etc/init.d/net.eth0 restart:

Code: [Select]
[www3 ~]# /etc/init.d/net.eth0 restart
 * Unmounting network filesystems ...                                                                         [ ok ]
 * Bringing down interface eth0
Error: ipv6: Router advertisement is disabled on device.
 * Bringing up interface eth0
 *   96.57.23.83/29 ...                                                                                       [ ok ]
 *   Adding routes
 *     default via 96.57.23.81 ...                                                                            [ ok ]
 *     96.57.23.80/29 via 96.57.23.81 dev eth0 ...                                                            [ ok ]
[www3 ~]#  * Mounting network filesystems ...                                                                 [ ok ]


First, why can't it just remain up regardless of the condition of the connection? Just stay UP.
Second, why can't it fetch the correct route when it goes down and comes back up? It is obviously not using my /etc/init.d/net.eth0 file.
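
One way to see whether the default route actually vanished after a link flap is to inspect the routing table directly. The sketch below is only a diagnostic under assumptions: the gateway 96.57.23.81 and the interface eth0 are taken from the restart output above, while the helper names `has_default_route` and `restore_cmd` and the `GATEWAY`/`IFACE` variables are made up for illustration. It does not explain why the route is lost; it only reports the state and prints the command that would put the route back.

```shell
#!/bin/sh
# Values taken from the restart output above; override via the environment.
GATEWAY="${GATEWAY:-96.57.23.81}"
IFACE="${IFACE:-eth0}"

# Succeeds if the routing table text on stdin has a default route via $IFACE.
has_default_route() {
    grep -q "^default via .* dev $IFACE"
}

# Print the command that would put the default route back (run it as root).
restore_cmd() {
    echo "ip route replace default via $GATEWAY dev $IFACE"
}

if ip route show default 2>/dev/null | has_default_route; then
    echo "default route present"
else
    echo "default route missing; as root, run: $(restore_cmd)"
fi
```

Running this right after a "NIC Link is Up" message in the log would show whether the route really disappeared or whether something else is wrong.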

Re: etho keeps dropping and can't restart

Reply #1
This is the 3rd report I have noticed about this issue in the past 10 days (the first 2 were not in this forum).
It might very well be related to a kernel change.

This could very well be related to https://bugzilla.kernel.org/show_bug.cgi?id=118721

Comments 9 and 10 show workarounds:

Quote
Disabling TSO seems to have fixed the problem for me. (I needed to set it after a fresh boot, *before* the interface starts bailing out continually.)

Quote
Also hit this issue - might be helpful to others, reloading the module with the parameter Node=0 (The NUMA node my NIC is on - modprobe e1000e Node=0) appears to have worked around the issue.

I suggest trying these, also to clarify whether it is indeed the same problem as reported here.
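
For anyone wanting to try the two quoted workarounds, a minimal sketch follows. It assumes the interface is eth0 and that the installed e1000e module accepts the Node parameter from the quote (`modinfo -p e1000e` lists the parameters a given build supports). `IFACE`, `DRY_RUN` and the helper names are illustrative only. By default the script just prints the commands instead of executing them.

```shell
#!/bin/sh
# IFACE and DRY_RUN are illustrative variables, not part of the quoted comments.
IFACE="${IFACE:-eth0}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them (needs root)

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Workaround from comment 9: turn off TCP segmentation offload.
disable_tso() {
    run ethtool -K "$IFACE" tso off
}

# Workaround from comment 10: reload the driver with Node=0.
reload_e1000e() {
    run modprobe -r e1000e
    run modprobe e1000e Node=0
}

disable_tso
reload_e1000e
```

Reloading the module takes the interface down, so with `DRY_RUN=0` this is better run from a local console than over an SSH session on that NIC.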

artist
Linux is simple; use Artix, or Submit Your System To Evil Malicious D(a)emons

Re: etho keeps dropping and can't restart

Reply #2


Quote
Also hit this issue - might be helpful to others, reloading the module with the parameter Node=0 (The NUMA node my NIC is on - modprobe e1000e Node=0) appears to have worked around the issue.



Solves it for me too with the e1000.


No issues with a wired nic laptop with

Code: [Select]
Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8211/8411 PCI Express Gigabit Ethernet Controller (rev 15)

The problem is present with our default and LTS kernel atm.

Re: etho keeps dropping and can't restart

Reply #3
How do I set this modprobe setting at boot? Does that require a grub configuration change?

BTW, this is the box with the Artix mirror on it. Funky behavior from this module has happened before, years ago. At one time I had to compile the module from GitHub and install it manually, which caused big problems for pacman updates.
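
A possible answer, as a sketch rather than a tested recipe: options for loadable modules are normally read by modprobe from files in /etc/modprobe.d, so a grub change should not be needed as long as e1000e is built as a module. The helper name `write_e1000e_conf` is hypothetical, and the example writes to a scratch directory so it can be tried without root. Whether the installed driver honours Node is an assumption carried over from the quoted comment.

```shell
#!/bin/sh
# write_e1000e_conf DIR: hypothetical helper that stores the module option
# from the quoted workaround in DIR/e1000e.conf.
write_e1000e_conf() {
    printf 'options e1000e Node=0\n' > "$1/e1000e.conf"
}

# Demonstration against a scratch directory; on a real system DIR would be
# /etc/modprobe.d and the call would need root.
scratch="$(mktemp -d)"
write_e1000e_conf "$scratch"
cat "$scratch/e1000e.conf"
rm -r "$scratch"
```

If the driver were built into the kernel instead of being a module, the equivalent would be the kernel command-line form `e1000e.Node=0`, and only in that case would the bootloader configuration come into play.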

Re: etho keeps dropping and can't restart

Reply #4
There is now 1 single report of a network disconnect with the modprobe modification enabled.

On a machine that sometimes shows the issues we are now testing with the networkmanager service disabled.

artist

Re: etho keeps dropping and can't restart

Reply #5
According to the kernel changelogs for 6.12 and the upcoming 6.13, they changed and reverted commits and fixed the e1000e driver.
With a bit of luck, 6.13 cures it.

Re: etho keeps dropping and can't restart

Reply #6
I'm currently building kernel 6.12.10 with 2 commits that *might* have introduced the problem undone.

artist

Re: etho keeps dropping and can't restart

Reply #7
For those who want to test the updated kernel that *might* fix the problem:
- I cannot test if the problem is fixed as none of my machines show it. I did install the kernel and reboot as a basic test, and that worked fine.
- Make sure to have eg. kernel-lts installed, just in case
- Run pacman -U https://omniverse.artixlinux.org/x86_64/linux-6.12.10.artix1-1-x86_64.pkg.tar.zst
- Reboot and ...

artist

Re: etho keeps dropping and can't restart

Reply #8
Please also test linux-6.13 in gremlins.
My first test with it seems to have the problem fixed.

Re: etho keeps dropping and can't restart

Reply #9
[ruben@www3 ~]$ uname -a
Linux www3 6.6.69-1-lts #1 SMP PREEMPT_DYNAMIC Fri, 03 Jan 2025 16:57:23 +0000 x86_64 GNU/Linux

That is what is running now, and it seems to be fixed at the moment.

I am sorry for being late getting back. I seem to have been overwhelmed with patients at work and have been dragging home exhausted.

Re: etho keeps dropping and can't restart

Reply #10
Actually, I am a little confused.

Code: [Select]
[ruben@www3 ~]$ sudo pacman -Q linux
linux 6.12.8.artix1-1


Code: [Select]
[ruben@www3 ~]$ uname -a
Linux www3 6.6.69-1-lts #1 SMP PREEMPT_DYNAMIC Fri, 03 Jan 2025 16:57:23 +0000 x86_64 GNU/Linux

say what??!!

I never seemed to get 6.13 up
This is my main webserver and it is on a microcomputer by fit/pc and I hate to climb into the closet to restart it :)  If it doesn't reboot and sshd doesn't come up, it is a drag.
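
One thing that may explain the mismatch: the running release ends in -lts, which matches the linux-lts package version (6.6.69-1) reported later in this thread, whereas `pacman -Q linux` queries the separate linux package. The sketch below guesses the owning package from the release string. `pkg_for_release` is a made-up helper, and the suffix rule is an assumption based only on the version strings shown in this thread.

```shell
#!/bin/sh
# pkg_for_release RELEASE: guess which package ships a kernel release,
# assuming the naming seen in this thread ("-lts" suffix => linux-lts).
pkg_for_release() {
    case "$1" in
        *-lts) echo linux-lts ;;
        *)     echo linux ;;
    esac
}

running="$(uname -r)"
pkg="$(pkg_for_release "$running")"
echo "running kernel: $running (probably from package $pkg)"

# Show the installed version of that package when pacman is available.
if command -v pacman >/dev/null 2>&1; then
    pacman -Q "$pkg"
fi
```

If the version printed by pacman matches the running release, the bootloader simply picked the LTS kernel rather than a stale one.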

Re: etho keeps dropping and can't restart

Reply #11

This is likely because grub (or another bootloader) is not using the kernel installed by pacman. If so, the kernel modules installed by pacman are not available to the running kernel either (though it's possible you still have the older, running kernel's modules as well; take a look in /usr/lib/modules).
There are a few ways this can happen, but one example is grub booting from a partition which doesn't get mounted by the main system, so pacman never updates the actual kernel in that partition.
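
A quick way to check the module side of this, sketched under the assumption that module trees live in /usr/lib/modules as mentioned above (`has_modules` is a hypothetical helper):

```shell
#!/bin/sh
# has_modules RELEASE [DIR]: succeeds if DIR (default /usr/lib/modules)
# contains a module tree for RELEASE.
has_modules() {
    [ -d "${2:-/usr/lib/modules}/$1" ]
}

running="$(uname -r)"
if has_modules "$running"; then
    echo "module tree for running kernel $running is present"
else
    echo "no module tree for running kernel $running"
fi
```

A missing tree for the running release would be consistent with the bootloader starting a kernel that pacman no longer manages.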

Re: etho keeps dropping and can't restart

Reply #12
When did the mirrors update? I just did an update and linux-lts jumped from

Code: [Select]
 sudo pacman -S linux-lts
warning: linux-lts-6.6.69-1 is up to date -- reinstalling
resolving dependencies...
looking for conflicting packages...

to

Code: [Select]
[ruben@www3 ~]$ sudo pacman -S linux-lts
resolving dependencies...
looking for conflicting packages...

Packages (1) linux-lts-6.12.16-1

To me that is very confusing. How does updating the mirror list produce such a drastic change in versions? Aren't the mirrors, new or not, supposed to be all synchronized?

Re: etho keeps dropping and can't restart

Reply #13
Quote
actually - I am a little confused

Quote
This is likely because grub (or another bootloader) is not using the kernel installed by pacman. If so, the kernel modules installed by pacman are not available to the running kernel either (though it's possible you still have the older, running kernel's modules as well; take a look in /usr/lib/modules).
There are a few ways this can happen, but one example is grub booting from a partition which doesn't get mounted by the main system, so pacman never updates the actual kernel in that partition.


I have dozens of module trees, which is a shock!

Code: [Select]
total 320
drwxr-xr-x   4 root root   4096 Jan 13 00:06 6.6.69-1-lts
drwxr-xr-x   3 root root   4096 Jan 13 00:06 6.6.47-1-lts
drwxr-xr-x   4 root root   4096 Jan 13 00:06 6.12.8-artix1-1
drwxr-xr-x   3 root root   4096 Jan 13 00:06 6.10.6-artix1-1
drwxr-xr-x 161 root root 143360 Jan 13 00:06 ..
drwxr-xr-x  45 root root   4096 Jan 13 00:05 .
drwxr-xr-x   3 root root   4096 Aug 28  2024 6.6.42-1-lts
drwxr-xr-x   3 root root   4096 Aug 28  2024 6.10.2-artix1-1
drwxr-xr-x   3 root root   4096 Aug  4  2024 6.9.7-artix1-1
drwxr-xr-x   3 root root   4096 Aug  4  2024 6.6.36-1-lts
drwxr-xr-x   3 root root   4096 Jul  4  2024 6.7.6-artix1-1

All the way to
drwxr-xr-x   2 root root   4096 Mar 16  2018 4.14.19-1-lts
drwxr-xr-x   2 root root   4096 Mar 16  2018 4.14.18-1-lts
drwxr-xr-x   2 root root   4096 Mar 16  2018 4.14.13-1-ARTIX
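
To sort out which of these trees still belong to something, one option is to list every tree other than the running kernel's and ask pacman whether any installed package owns it. This is a sketch only: `other_trees` is a made-up helper, and it assumes `pacman -Qo` accepts a directory path on this system. It only reports and deletes nothing.

```shell
#!/bin/sh
# other_trees DIR RELEASE: print the module trees in DIR that do not
# belong to RELEASE.
other_trees() {
    for tree in "$1"/*/; do
        [ -d "$tree" ] || continue
        name="$(basename "$tree")"
        [ "$name" = "$2" ] || echo "$name"
    done
}

other_trees /usr/lib/modules "$(uname -r)" | while read -r name; do
    if command -v pacman >/dev/null 2>&1 &&
       pacman -Qo "/usr/lib/modules/$name" >/dev/null 2>&1; then
        echo "$name: owned by an installed package"
    else
        echo "$name: not reported as owned by pacman -Qo"
    fi
done
```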



 

Re: etho keeps dropping and can't restart

Reply #14

[ruben@www3 ~]$ uname -a
Linux www3 6.12.20-1-lts #1 SMP PREEMPT_DYNAMIC Sun, 23 Mar 2025 19:17:49 +0000 x86_64 GNU/Linux
[ruben@www3 ~]$ uname -a
Linux www3 6.12.20-1-lts #1 SMP PREEMPT_DYNAMIC Sun, 23 Mar 2025 19:17:49 +0000 x86_64 GNU/Linux
[ruben@www3 ~]$ uname -a
Linux www3 6.12.20-1-lts #1 SMP PREEMPT_DYNAMIC Sun, 23 Mar 2025 19:17:49 +0000 x86_64 GNU/Linux
[ruben@www3 ~]$

It completely crashed and burned, which is very frustrating as I was doing a long, long backup remotely.

Dozens of these from dmesg:

[25275.004313] e1000e 0000:00:19.0 eth1: Detected Hardware Unit Hang:
                 TDH                  <41>
                 TDT                  <a6>
                 next_to_use          <a6>
                 next_to_clean        <3d>
               buffer_info[next_to_clean]:
                 time_stamp           <1006d330c>
                 next_to_watch        <41>
                 jiffies              <1007251c0>
                 next_to_watch.status <0>
               MAC Status             <80043>
               PHY Status             <796d>
               PHY 1000BASE-T Status  <0>
               PHY Extended Status    <3000>
               PCI Status             <10>
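
To put a number on "dozens", the hang reports in the kernel log can be counted. A minimal sketch, assuming the message text matches the excerpt above; `count_hangs` is a hypothetical helper, and reading dmesg may require root on some systems.

```shell
#!/bin/sh
# count_hangs: count "Detected Hardware Unit Hang" reports in the kernel
# log text supplied on stdin.
count_hangs() {
    grep -c 'Detected Hardware Unit Hang'
}

# grep -c exits non-zero when the count is 0, hence the "|| true".
hangs="$(dmesg 2>/dev/null | count_hangs || true)"
echo "hang reports in dmesg: ${hangs:-0}"
```

Comparing the count before and after trying a workaround gives a rough idea of whether it made any difference.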

 