From: Sam Vilain (sam_at_vilain.net)
Date: Mon 24 Feb 2003 - 03:01:30 GMT
On Sat, 22 Feb 2003 22:39, DaveC wrote:
> >>EIP; c0123c30 <sys_release_ip_info+20/50> <=====
>
> Trace; c011a9a8 <session_of_pgrp+48/70>
> Trace; c016f1f6 <tiocspgrp+66/90>
> Trace; c016f5e8 <tty_ioctl+258/370>
> Trace; c0147217 <sys_ioctl+217/230>
> Trace; c010883b <system_call+33/38>
> Code; c0123c30 <sys_release_ip_info+20/50>
> 00000000 <_EIP>:
> Code; c0123c30 <sys_release_ip_info+20/50> <=====
> 0: 8b 01 mov (%ecx),%eax <=====
> Code; c0123c32 <sys_release_ip_info+22/50>
> 2: 48 dec %eax
> Code; c0123c33 <sys_release_ip_info+23/50>
> 3: 85 c0 test %eax,%eax
Blast; I hadn't even tested chbind, just the sched functionality.
However, this is still interesting. The exception is happening in
sys_release_ip_info, during the reference count decrement:
    if (ip_info != NULL){
        down_write (&uts_sem);
        ip_info->refcount--;        <==== here
        if (ip_info->refcount == 0){
            // printk ("vfree s_info %d\n",p->pid);
            vfree (ip_info);
        }
        up_write (&uts_sem);
    }
It appears that the ip_info structure has already been deallocated, which
you would not expect until all the processes in that context had exited.
There's a counter being decremented incorrectly somewhere...
However, my patch doesn't come near that code path at all, and this happens
with or without --flag sched turned on. IMHO there is a race condition
here anyway; the semaphore is probably taken in the wrong place, since the
NULL check on ip_info happens before down_write. But I'm having difficulty
seeing how the release got called twice in the first place.
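
To make the locking concern concrete, here is a rough userspace sketch of a release path that reads the pointer, checks it, decrements, and frees all under one lock, and clears the caller's pointer so a second release on the same slot does nothing. This is only an illustration under assumptions, not the vserver code: struct ip_info is reduced to its counter, a pthread mutex stands in for uts_sem, malloc/free stand in for vmalloc/vfree, and the names ip_info_get, ip_info_put, info_lock and slot are made up for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-context structure; only the counter matters here. */
struct ip_info {
    int refcount;
};

/* Userspace stand-in for uts_sem (assumption: one global lock guards all refcounts). */
static pthread_mutex_t info_lock = PTHREAD_MUTEX_INITIALIZER;

/* Take a reference. Every get must be balanced by exactly one put. */
static void ip_info_get(struct ip_info *info)
{
    pthread_mutex_lock(&info_lock);
    info->refcount++;
    pthread_mutex_unlock(&info_lock);
}

/*
 * Drop the reference held in *slot and clear the slot.
 * The pointer is read and tested only after the lock is taken, so two
 * callers racing on the same slot cannot both pass the NULL check.
 * Returns 1 if the structure was freed, 0 otherwise.
 */
static int ip_info_put(struct ip_info **slot)
{
    int freed = 0;

    pthread_mutex_lock(&info_lock);
    struct ip_info *info = *slot;
    if (info != NULL) {
        *slot = NULL;                 /* a repeated put on this slot becomes a no-op */
        assert(info->refcount > 0);   /* trips on an unbalanced decrement */
        if (--info->refcount == 0) {
            free(info);
            freed = 1;
        }
    }
    pthread_mutex_unlock(&info_lock);
    return freed;
}
```

Clearing the slot only protects against the same holder releasing twice; it would not help if some other path decrements the counter without having taken a reference, which is the kind of imbalance the assert is there to catch.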
Thanks for the report; I'll look into it and come back with a more fully
tested patch.
-- 
Sam Vilain, sam_at_vilain.net
"God is a comedian playing to an audience too afraid to laugh." - Voltaire