From: Herbert Poetzl (herbert_at_13thfloor.at)
Date: Thu 13 Feb 2003 - 07:58:13 GMT
On Wed, Feb 12, 2003 at 03:06:48PM -0600, John Goerzen wrote:
> Hello,
>
> A little while ago, I reported a crash with 2.4.20ctx16. Now, it has
> happened again, and I have captured much more debug info:
This one was very useful and informative.
I would suggest focusing on the NIC driver and on sending
UDP (or TCP) messages.
It should be possible to trigger this issue by
sending a lot of UDP (or TCP) packets with hping2
(or something similar) on systems that have crashed
at least once with "kernel BUG at sched.c:570".
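
If hping2 is not at hand, a rough stand-in for the "something similar"
could look like the sketch below. It only sends plain UDP datagrams in a
tight loop; the function name, the payload size and the error handling are
my own assumptions, and it does not try to reproduce hping2's options.

```python
import socket


def send_udp_burst(host, port, count, payload=b"x" * 64):
    """Send `count` UDP datagrams to (host, port); return how many went out.

    Hypothetical helper for illustration only, not a replacement for hping2.
    """
    sent = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for _ in range(count):
            try:
                sock.sendto(payload, (host, port))
                sent += 1
            except OSError:
                # Under heavy load the send can fail (for example when
                # buffers are full); skip that datagram and keep going.
                continue
    return sent
```

Something like send_udp_burst("192.0.2.10", 9, 100000) would then be run
against the affected box, where the address and port are placeholders to
be replaced with a test machine you are allowed to load.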
My guess is that some NIC driver has a problem
with schedule() being called in softirq context (shared code),
but maybe I'm totally wrong, so do not blame me
if it turns out to be something completely different.
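
To check a decoded trace against that guess, one could list the frames
that ksymoops prints before do_softirq (the trace is innermost first, so
those are the frames that ran on top of the softirq). The sketch below is
only an illustration: the helper names are made up, the sample lines are
copied from the report quoted below, and since 2.4 traces are built by
scanning the stack, stale entries may show up in the result.

```python
import re

# Matches lines such as "> Trace; c0122e39 <do_softirq+d9/e0>".
TRACE_RE = re.compile(
    r"Trace;\s+([0-9a-f]{8})\s+<([^+>]+)\+([0-9a-f]+)/([0-9a-f]+)>"
)


def parse_trace(text):
    """Return the symbol names of all 'Trace;' lines, in printed order."""
    frames = []
    for line in text.splitlines():
        match = TRACE_RE.search(line)
        if match:
            frames.append(match.group(2))
    return frames


def frames_inside_softirq(frames, marker="do_softirq"):
    """Return the frames printed before `marker` (innermost first)."""
    if marker not in frames:
        return []
    return frames[:frames.index(marker)]


# A few lines taken from the quoted ksymoops output.
SAMPLE = """\
> Trace; c02a510c <rwsem_down_failed_common+5c/7e>
> Trace; c02a4f99 <rwsem_down_write_failed+29/40>
> Trace; c012c12a <.text.lock.sys+ea/190>
> Trace; c0250eb9 <net_rx_action+99/140>
> Trace; c0122e39 <do_softirq+d9/e0>
> Trace; c02696cd <tcp_sendmsg+57d/10c0>
"""

inside = frames_inside_softirq(parse_trace(SAMPLE))
```

For the sample, inside contains rwsem_down_write_failed, a path that can
sleep, which would be consistent with the guess above but does not prove it.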
#include <std_disclaimer.h>
best,
Herbert
> ksymoops 2.4.8 on i686 2.4.20ctx-16. Options used
> -V (default)
> -k /proc/ksyms (default)
> -l /proc/modules (default)
> -o /lib/modules/2.4.20ctx-16/ (default)
> -m /boot/System.map-2.4.20ctx-16 (default)
>
> kernel BUG at sched.c:570!
> invalid operand: 0000
> CPU: 1
> EIP: 0010:[<c011a201>] Not tainted
> Using defaults from ksymoops -t elf32-i386 -a i386
> EFLAGS: 00010282
> eax: 000000018 ebx: 00000080 ecx: 00000004 edx: 000000000
> esi: d61ca000 edi: d61cbb6c ebp: d71cbb44 esp: d71cbb14
> ds: 0018 es: 0018 ss: 0018
> Process amavisd (pid: 5552, stackpage=d71cb000)
> Stack: c02b910a c02fc6b4 c1030020 000000207 ffffffff d71ca00 d52b0080
> 00000001 c0279edf c02fbef8 d71ca000 d71cbb6c e9733180 c02a510c 00000014
> c02fbef8 e22eeb00 d1852827 c02a4f99 c02fdef8 d71cbb6c ffffffff 00000000
> 00000000
> Call trace: [<c0279edf>] [<c02a510c>] [<c02a4f99>] [<c012c12a>] [<c02882ef>]
> [<c027ada9>]
> [<c024ab25>]
> [<c027b2f9>] [<c0260e30>] [<c0260f7e>] [<c0260e30>] [<c0258240>]
> [<c0260e30>] [<c0260fa0>] [<c0260bb3>]
> [<c0260e30>] [<c0261169>] [<c0257e74>] [<c0260fa0>] [<c0258240>]
> [<c0260fa0>] [<c0260d7a>] [<c0260fa0>]
> [<c0250c43>] [<c0250d7c>] [<c0250eb9>] [<c0122e39>] [<c02650f0>]
> [<c02587d6>] [<c02650f0>] [<c0263ef3>]
> [<c02650f0>] [<c0279d2d>] [<c02745d7>] [<c0275194>]
> [<c02696cd>] [<c0159f85>] [<c012e97d>] [<c02896b2>]
> [<c0247f65>] [<c02481fe>] [<c01429b3>] [<c01093ef>]
> Code: 0f 0b 3a 02 02 91 2b c0 e9 3f fb ff ff 0f 0b 33 02 02 91 2b
>
>
> >>EIP; c011a201 <schedule+501/530> <=====
>
> >>esi; d61ca000 <_end+15dfc91c/384b797c>
> >>edi; d61cbb6c <_end+15dfe488/384b797c>
> >>ebp; d71cbb44 <_end+16dfe460/384b797c>
> >>esp; d71cbb14 <_end+16dfe430/384b797c>
>
> Trace; c0279edf <tcp_v4_send_reset+11f/190>
> Trace; c02a510c <rwsem_down_failed_common+5c/7e>
> Trace; c02a4f99 <rwsem_down_write_failed+29/40>
> Trace; c012c12a <.text.lock.sys+ea/190>
> Trace; c02882ef <inet_sock_destruct+ff/1c0>
> Trace; c027ada9 <tcp_v4_do_rcv+139/1c0>
> Trace; c024ab25 <sk_free+85/90>
> Trace; c027b2f9 <tcp_v4_rcv+4c9/5a0>
> Trace; c0260e30 <ip_local_deliver_finish+0/170>
> Trace; c0260f7e <ip_local_deliver_finish+14e/170>
> Trace; c0260e30 <ip_local_deliver_finish+0/170>
> Trace; c0258240 <nf_hook_slow+100/1d0>
> Trace; c0260e30 <ip_local_deliver_finish+0/170>
> Trace; c0260fa0 <ip_rcv_finish+0/226>
> Trace; c0260bb3 <ip_local_deliver+53/80>
> Trace; c0260e30 <ip_local_deliver_finish+0/170>
> Trace; c0261169 <ip_rcv_finish+1c9/226>
> Trace; c0257e74 <nf_iterate+54/a0>
> Trace; c0260fa0 <ip_rcv_finish+0/226>
> Trace; c0258240 <nf_hook_slow+100/1d0>
> Trace; c0260fa0 <ip_rcv_finish+0/226>
> Trace; c0260d7a <ip_rcv+19a/250>
> Trace; c0260fa0 <ip_rcv_finish+0/226>
> Trace; c0250c43 <netif_receive_skb+d3/190>
> Trace; c0250d7c <process_backlog+7c/120>
> Trace; c0250eb9 <net_rx_action+99/140>
> Trace; c0122e39 <do_softirq+d9/e0>
> Trace; c02650f0 <ip_queue_xmit2+0/250>
> Trace; c02587d6 <.text.lock.netfilter+c0/ea>
> Trace; c02650f0 <ip_queue_xmit2+0/250>
> Trace; c0263ef3 <ip_queue_xmit+2e3/300>
> Trace; c02650f0 <ip_queue_xmit2+0/250>
> Trace; c0279d2d <tcp_v4_send_check+4d/e0>
> Trace; c02745d7 <tcp_transmit_skb+2c7/460>
> Trace; c0275194 <tcp_write_xmit+184/270>
> Trace; c02696cd <tcp_sendmsg+57d/10c0>
> Trace; c0159f85 <__mark_inode_dirty+b5/c0>
> Trace; c012e97d <do_wp_page+2bd/300>
> Trace; c02896b2 <inet_sendmsg+42/50>
> Trace; c0247f65 <sock_sendmsg+75/c0>
> Trace; c02481fe <sock_write+ae/c0>
> Trace; c01429b3 <sys_write+a3/160>
> Trace; c01093ef <system_call+33/38>
>
> Code; c011a201 <schedule+501/530>
> 00000000 <_EIP>:
> Code; c011a201 <schedule+501/530> <=====
> 0: 0f 0b ud2a <=====
> Code; c011a203 <schedule+503/530>
> 2: 3a 02 cmp (%edx),%al
> Code; c011a205 <schedule+505/530>
> 4: 02 91 2b c0 e9 3f add 0x3fe9c02b(%ecx),%dl
> Code; c011a20b <schedule+50b/530>
> a: fb sti
> Code; c011a20c <schedule+50c/530>
> b: ff (bad)
> Code; c011a20d <schedule+50d/530>
> c: ff 0f decl (%edi)
> Code; c011a20f <schedule+50f/530>
> e: 0b 33 or (%ebx),%esi
> Code; c011a211 <schedule+511/530>
> 10: 02 02 add (%edx),%al
> Code; c011a213 <schedule+513/530>
> 12: 91 xchg %eax,%ecx
> Code; c011a214 <schedule+514/530>
> 13: 2b 00 sub (%eax),%eax
>
> <0>Kernel panic: Aiee, killing interrupt handler!
>
> 1 warning issued. Results may not be reliable.