Friday, December 21, 2007

NETDEV WATCHDOG: eth0: transmit timed out

Recently I encountered a problem with my desktop computer running Ubuntu 7.10 Gutsy Gibbon.

After some time the network would die, and the only way to bring it back was to restart the computer.
Nothing else helped:
  • NetworkManager
  • sudo /etc/init.d/networking restart
  • reloading kernel modules
Looking into /var/log/syslog, I found this:
Dec 7 08:09:36 moonbiter kernel: [ 3751.383550] NETDEV WATCHDOG: eth0: transmit timed out
Dec 7 08:09:36 moonbiter kernel: [ 3751.383626] eth0: Transmit timed out, status 0003, PHY status 786d, resetting...
Dec 7 08:09:36 moonbiter kernel: [ 3751.383939] eth0: link up, 100Mbps, full-duplex, lpa 0x45E1
I have two Ethernet cards: one built-in and one in a PCI slot (lspci comes in handy):
00:12.0 Ethernet controller: VIA Technologies, Inc. VT6102 [Rhine-II] (rev 7c)
05:07.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8169 Gigabit Ethernet (rev 10)

Googling turned up a lot of contradictory solutions, and none of them worked. Finally, on one of the forums I found a suggestion to check the IRQs. Indeed, this could have been the problem: I had recently bought an nVidia GeForce 7300 GT, and to fit it in I had to move my Realtek Ethernet adapter to another PCI slot.

Looking into /proc/interrupts I found:
CPU0
0: 46332 IO-APIC-edge timer
1: 155 IO-APIC-edge i8042
6: 5 IO-APIC-edge floppy
7: 0 IO-APIC-edge parport0
8: 0 IO-APIC-edge rtc
9: 0 IO-APIC-fasteoi acpi
12: 1617 IO-APIC-edge i8042
14: 15664 IO-APIC-edge ide0
15: 1242 IO-APIC-edge ide1
17: 717 IO-APIC-fasteoi eth1, HDA Intel
20: 0 IO-APIC-fasteoi uhci_hcd:usb1
21: 0 IO-APIC-fasteoi sata_via, uhci_hcd:usb3, ehci_hcd:usb5
22: 0 IO-APIC-fasteoi uhci_hcd:usb2
23: 13202 IO-APIC-fasteoi uhci_hcd:usb4, eth0
24: 13183 IO-APIC-fasteoi nvidia
NMI: 0
LOC: 46223
ERR: 0
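
Spotting shared IRQs by eye gets tedious, so here is a minimal Python sketch that lists every IRQ line with more than one device on it. It assumes the single-word controller column shown above (e.g. IO-APIC-fasteoi); newer kernels lay the file out differently. The function name shared_irqs and the SAMPLE text are only for illustration.

```python
import re

def shared_irqs(text):
    """Return {irq: [devices]} for IRQ lines that list more than one device."""
    shared = {}
    for line in text.splitlines():
        m = re.match(r"\s*(\d+):\s+(.*)", line)
        if not m:
            continue  # skips the CPU header and the NMI/LOC/ERR rows
        irq, fields = int(m.group(1)), m.group(2).split()
        # Drop the per-CPU interrupt counters (leading all-digit fields).
        while fields and fields[0].isdigit():
            fields.pop(0)
        if len(fields) < 2:
            continue
        # fields[0] is the controller type; the rest is the comma-separated device list.
        devices = [d.strip() for d in " ".join(fields[1:]).split(",")]
        if len(devices) > 1:
            shared[irq] = devices
    return shared

# A few lines copied from the output above (hypothetical sample for illustration).
SAMPLE = """\
CPU0
17: 717 IO-APIC-fasteoi eth1, HDA Intel
23: 13202 IO-APIC-fasteoi uhci_hcd:usb4, eth0
24: 13183 IO-APIC-fasteoi nvidia
NMI: 0
"""

if __name__ == "__main__":
    for irq, devices in sorted(shared_irqs(SAMPLE).items()):
        print(irq, devices)
```

On a live system the same function could be fed the contents of /proc/interrupts instead of SAMPLE.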

And in /var/log/syslog:
Dec 7 06:59:06 moonbiter kernel: [ 1637.162744] irq 23: nobody cared (try booting with the "irqpoll" option)
Dec 7 06:59:06 moonbiter kernel: [ 1637.162748]
Dec 7 06:59:06 moonbiter kernel: [ 1637.162749] Call Trace:
Dec 7 06:59:06 moonbiter kernel: [ 1637.162751] [__report_bad_irq+30/128] __report_bad_irq+0x1e/0x80
Dec 7 06:59:06 moonbiter kernel: [ 1637.162770] [note_interrupt+643/704] note_interrupt+0x283/0x2c0
Dec 7 06:59:06 moonbiter kernel: [ 1637.162777] [handle_fasteoi_irq+221/272] handle_fasteoi_irq+0xdd/0x110
Dec 7 06:59:06 moonbiter kernel: [ 1637.162897] [_end+129724926/2130332920] :nvidia:_nv003707rm+0x1f/0x27
Dec 7 06:59:06 moonbiter kernel: [ 1637.162904] [do_IRQ+123/256] do_IRQ+0x7b/0x100
Dec 7 06:59:06 moonbiter kernel: [ 1637.162909] [ret_from_intr+0/10] ret_from_intr+0x0/0xa
Dec 7 06:59:06 moonbiter kernel: [ 1637.162914] [pci_conf1_read+0/272] pci_conf1_read+0x0/0x110
Dec 7 06:59:06 moonbiter kernel: [ 1637.162921] [__do_softirq+84/224] __do_softirq+0x54/0xe0
Dec 7 06:59:06 moonbiter kernel: [ 1637.162929] [call_softirq+28/48] call_softirq+0x1c/0x30
Dec 7 06:59:06 moonbiter kernel: [ 1637.162933] [do_softirq+53/144] do_softirq+0x35/0x90
Dec 7 06:59:06 moonbiter kernel: [ 1637.162937] [do_IRQ+128/256] do_IRQ+0x80/0x100
Dec 7 06:59:06 moonbiter kernel: [ 1637.162942] [ret_from_intr+0/10] ret_from_intr+0x0/0xa
Dec 7 06:59:06 moonbiter kernel: [ 1637.162944] [_end+127707073/2130332920] :processor:acpi_processor_idle+0x25f/0x456
Dec 7 06:59:06 moonbiter kernel: [ 1637.162964] [_end+127707063/2130332920] :processor:acpi_processor_idle+0x255/0x456
Dec 7 06:59:06 moonbiter kernel: [ 1637.162971] [_end+127706466/2130332920] :processor:acpi_processor_idle+0x0/0x456
Dec 7 06:59:06 moonbiter kernel: [ 1637.162976] [cpu_idle+112/192] cpu_idle+0x70/0xc0
Dec 7 06:59:06 moonbiter kernel: [ 1637.162982] [start_kernel+645/784] start_kernel+0x285/0x310
Dec 7 06:59:06 moonbiter kernel: [ 1637.162987] [x86_64_start_kernel+286/352] _sinittext+0x11e/0x160
Dec 7 06:59:06 moonbiter kernel: [ 1637.162991]
Dec 7 06:59:06 moonbiter kernel: [ 1637.162992] handlers:
Dec 7 06:59:06 moonbiter kernel: [ 1637.162994] [_end+128326184/2130332920] (usb_hcd_irq+0x0/0x60 [usbcore])
Dec 7 06:59:06 moonbiter kernel: [ 1637.163010] [_end+128535048/2130332920] (rhine_interrupt+0x0/0xc70 [via_rhine])
Dec 7 06:59:06 moonbiter kernel: [ 1637.163017] Disabling IRQ #23
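
To check a whole log for this pattern, something like the following Python sketch could help. It only matches the two message texts seen above ("irq N: nobody cared" and "Disabling IRQ #N"); the function name find_irq_trouble is made up for illustration, and other kernel versions may word these messages differently.

```python
import re

NOBODY_CARED = re.compile(r"irq (\d+): nobody cared")
DISABLING = re.compile(r"Disabling IRQ #(\d+)")

def find_irq_trouble(log_text):
    """Return (reported, disabled): sorted IRQ numbers mentioned in each message type."""
    reported, disabled = set(), set()
    for line in log_text.splitlines():
        m = NOBODY_CARED.search(line)
        if m:
            reported.add(int(m.group(1)))
        m = DISABLING.search(line)
        if m:
            disabled.add(int(m.group(1)))
    return sorted(reported), sorted(disabled)

# Two lines copied from the log excerpt above.
LOG_SAMPLE = (
    'Dec 7 06:59:06 moonbiter kernel: [ 1637.162744] irq 23: nobody cared '
    '(try booting with the "irqpoll" option)\n'
    'Dec 7 06:59:06 moonbiter kernel: [ 1637.163017] Disabling IRQ #23\n'
)

if __name__ == "__main__":
    print(find_irq_trouble(LOG_SAMPLE))
```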

So the IRQ assigned to eth0 was disabled...
Following the advice from the forum, I added the "noapic" boot option to the kernel line in my /boot/grub/menu.lst.
And now the network is rock solid again :D.

Just to see how /proc/interrupts looks now:
CPU0
0: 145441 XT-PIC-XT timer
1: 1186 XT-PIC-XT i8042
2: 0 XT-PIC-XT cascade
4: 0 XT-PIC-XT uhci_hcd:usb3, ehci_hcd:usb5
5: 24510 XT-PIC-XT uhci_hcd:usb1, uhci_hcd:usb4, eth0
6: 5 XT-PIC-XT floppy
7: 47352 XT-PIC-XT parport0
8: 0 XT-PIC-XT rtc
9: 0 XT-PIC-XT acpi
10: 48019 XT-PIC-XT nvidia
11: 740 XT-PIC-XT sata_via, uhci_hcd:usb2, eth1, HDA Intel
12: 7125 XT-PIC-XT i8042
14: 51693 XT-PIC-XT ide0
15: 5002 XT-PIC-XT ide1
NMI: 0
LOC: 145342
ERR: 4
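
A quick way to confirm that the boot option actually took effect is to count which interrupt controller each IRQ line reports. The Python sketch below assumes the same old-style layout as the listings in this post, where the controller name is a single word starting with IO-APIC or XT-PIC; the function name controller_counts is only illustrative.

```python
import re
from collections import Counter

def controller_counts(text):
    """Count numbered IRQ lines per controller family (IO-APIC, XT-PIC, or other)."""
    counts = Counter()
    for line in text.splitlines():
        m = re.match(r"\s*\d+:\s+(.*)", line)
        if not m:
            continue  # header and NMI/LOC/ERR rows carry no controller column
        fields = m.group(1).split()
        # Drop the per-CPU interrupt counters (leading all-digit fields).
        while fields and fields[0].isdigit():
            fields.pop(0)
        if not fields:
            continue
        kind = fields[0]
        for prefix in ("IO-APIC", "XT-PIC"):
            if kind.startswith(prefix):
                kind = prefix
        counts[kind] += 1
    return counts

# A few lines copied from the listing above (sample for illustration).
AFTER_SAMPLE = """\
CPU0
0: 145441 XT-PIC-XT timer
5: 24510 XT-PIC-XT uhci_hcd:usb1, uhci_hcd:usb4, eth0
10: 48019 XT-PIC-XT nvidia
ERR: 4
"""

if __name__ == "__main__":
    print(dict(controller_counts(AFTER_SAMPLE)))
```

If IO-APIC entries still show up after rebooting, the option was most likely not applied to the kernel that actually booted.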

I'm not too happy with this solution but it works...

3 comments:

सुधांशु said...

Even with the pci=noapic option, my /proc/interrupts looks the same as in the first case... bump!

conrad said...

Hi, sorry for the late reply.

Here are my boot opts:
kernel /vmlinuz-2.6.24-21-generic root=UUID=a493ab07-cb5c-450d-8102-d59ec9231f38 ro noapic quiet splash

So, as you can see, I was rather more forceful: I used noapic instead of pci=noapic.

Cheers,
Konrad

Unknown said...

When I try to turn on noapic, my nVidia card doesn't work. It also doesn't appear in the /proc/interrupts file.