From: Herbert Poetzl (herbert_at_13thfloor.at)
Date: Wed 15 Jan 2003 - 22:50:57 GMT
On Wed, Jan 15, 2003 at 11:10:55PM +0200, vherva_at_turing.netspan.fi wrote:
> [Please Cc, although I will poll the archive]
I currently do not have the time to look into it,
but I would suggest checking /etc/nsswitch.conf against
how you expect name service resolution to work regarding
yellow pages (nis, nis+), and of course /etc/resolv.conf
and your name services ... this looks like some
name service/portmap issue ...
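
A quick way to see which databases would go through NIS is to scan
/etc/nsswitch.conf for nis/nisplus sources. A minimal sketch in Python,
assuming the usual "database: source ..." layout of that file; the
function name nis_databases and the sample text are made up for
illustration, and "compat" is listed only because it may consult NIS
for +/- entries:

```python
# Sources that can make a lookup go to NIS (assumption: "compat" is
# included because it may fall back to NIS for +/- entries).
NIS_SOURCES = ("nis", "nisplus", "compat")


def nis_databases(text):
    """Return {database: [NIS-related sources]} for nsswitch.conf text."""
    hits = {}
    for line in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        database, sources = line.split(":", 1)
        found = [s for s in sources.split() if s in NIS_SOURCES]
        if found:
            hits[database.strip()] = found
    return hits


# Hypothetical sample, not taken from the affected system.
sample = """
# example only
passwd: files nis
group:  files
hosts:  files dns
"""

for database, sources in nis_databases(sample).items():
    print(database, "->", " ".join(sources))
```

Inside the vserver you would feed it the real file, e.g.
nis_databases(open("/etc/nsswitch.conf").read()). A hit for passwd or
group would be consistent with id and ls -l looking for a NIS binding.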
best,
Herbert
PS: I will take a deeper look at it tomorrow ...
>
> We recently upgraded from 2.4.20-pre7-ctx13-owl kernel to
> 2.4.21-pre2-ctx16-owl (owl stands for the Openwall Linux security
> patchset). After that, certain processes run inside vservers began to
> hang. For example, doing this after a fresh boot of the host system:
>
> host> vserver www start
> host> vserver www enter
> (...)
> www> ls -l /
> <hangs after showing a few lines>
> www> id
> <hangs>
> www> uname -a
> <works>
> www> ls /
> <works>
>
> I first reverted the owl patch and verified it happens with
> 2.4.21-pre2-ctx16 and then with 2.4.20-ctx16. After a few iterations, it
> seems the problem happens with 2.4.19ctx-15, but does not happen with
> 2.4.19ctx-14 (both clean 2.4.19 with only ctx applied).
>
> Details:
>
> [root_at_vserver:www /]ls -l
> total 64
> drwxr-xr-x 2 root root 4096 Jan 6 15:45 bin
> drwxr-xr-x 2 root root 4096 Feb 6 1996 boot
> <hangs>
> [root_at_vserver:www /]id
> <hangs>
>
> Strace shows that "id" (and "ls -l", for that matter) hangs, looping over
> these syscalls:
>
> --------------------------------------------------------------
> open("/var/yp/binding/test-host.2", O_RDONLY) = -1 ENOENT
> (No such file or directory)
> gettimeofday({1042665515, 575639}, NULL) = 0
> socket(PF_INET, SOCK_DGRAM, IPPROTO_UDP) = 3
> getpid() = 18043
> bind(3, {sin_family=AF_INET,
> sin_port=htons(835), sin_addr=inet_addr("0.0.0.0")}}, 16) = 0
> ioctl(3, 0x5421, [1]) = 0
> setsockopt(3, SOL_IP, IP_RECVERR, [1], 4) = 0
> sendto(3,
> "\200;\351t\0\0\0\0\0\0\0\2\0\1\206\240\0\0\0\2\0\0\0\3"...,
> 56, 0, {sin_family=AF_INET, sin_port=htons(111),
> sin_addr=inet_addr("127.0.0.1")}}, 16) = 56
> poll([{fd=3, events=POLLIN, revents=POLLERR}], 1, 5000) = 1
> recvmsg(3, {msg_name(16)={sin_family=AF_INET,
> sin_port=htons(111), sin_addr=inet_addr("10.0.0.1")}},
> msg_iov(1)=[{"\200;\351t\0\0\0\0\0\0\0\2\0\1\206\240\0\0\0\2\0\0\0\3"...,
> 56}], msg_controllen=44, msg_control=0xbfffcd70, ,
> msg_flags=MSG_ERRQUEUE}, MSG_ERRQUEUE) = 56
> recvfrom(3, 0x804d140, 400, 0, 0xbffff230, 0xbfffcf14) = -1
> EAGAIN (Resource temporarily unavailable)
> poll([{fd=3,events=POLLIN}], 1, 5000) = 0
> ioctl(3, 0x8912, 0xbfffcf20) = 0
> ioctl(3, 0x8913, 0xbfffcf60) = 0
> sendto(3, "\200;\351t\0\0\0\0\0\0\0\2\0\1\206\240\0\0\0\2\0\0\0\3"..., 56,
> 0, {sin_family=AF_INET, sin_port=htons(111),
> sin_addr=inet_addr("127.0.0.1")}}, 16) = 56
> poll([{fd=3, events=POLLIN, revents=POLLERR}], 1, 5000) = 1
> recvmsg(3, {msg_name(16)={sin_family=AF_INET, sin_port=htons(111),
> sin_addr=inet_addr("10.0.0.1")}},
> msg_iov(1)=[{"\200;\351t\0\0\0\0\0\0\0\2\0\1\206\240\0\0\0\2\0\0\0\3"...,
> 56}], msg_controllen=44, msg_control=0xbfffcc30, ,
> msg_flags=MSG_ERRQUEUE}, MSG_ERRQUEUE) = 56
> recvfrom(3, 0x804d140, 400, 0, 0xbffff230, 0xbfffcf14) = -1 EAGAIN
> (Resource temporarily unavailable)
> poll( <unfinished ...>
> --------------------------------------------------------------
>
> I have no clue why id and ls would want to poll a network socket. On
> ctx14, it does that too, but does not hang.
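
A rough guess from the trace alone: the bytes passed to sendto() look
like the start of an ONC RPC call. The sketch below, in Python, unpacks
only the 24 bytes strace actually printed (the rest is elided there and
is not guessed at); the variable names are mine:

```python
import struct

# First 24 bytes of the sendto() payload, copied from the strace output
# (strace prints octal escapes). The remainder is elided in the trace.
payload = b"\200;\351t\0\0\0\0\0\0\0\2\0\1\206\240\0\0\0\2\0\0\0\3"

# An ONC RPC call header begins with six big-endian 32-bit words:
# xid, message type, RPC version, program, program version, procedure.
xid, msg_type, rpcvers, prog, vers, proc = struct.unpack(">6I", payload)

print("xid      = %#x" % xid)
print("msg_type = %d" % msg_type)   # 0 means CALL
print("rpcvers  = %d" % rpcvers)
print("program  = %d" % prog)
print("version  = %d" % vers)
print("proc     = %d" % proc)
```

Program 100000 is the portmapper, and procedure 3 of version 2 is
GETPORT, so this looks like the libc NIS code asking the portmapper on
port 111 for a service port. The POLLERR followed by a MSG_ERRQUEUE read
suggests the query comes back as an error rather than an answer, after
which the lookup is retried.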
>
> Full straces available at
>
> http://www.netspan.fi/tmp/strace-id-ctx14.txt
> http://www.netspan.fi/tmp/strace-id-ctx15.txt
>
> I see there's a bunch of network changes between ctx-14->15
> (http://www.netspan.fi/tmp/patch-ctx-14-15). I can't spot anything
> obvious, though.
>
> Any ideas?
>
>
> -- v --
>
> --
> Ville Herva Ville.Herva_at_netspan.fi +358-50-5164500
> Netspan Oy netspan_at_netspan.fi PL 65 FIN-02151 Espoo
> http://www.netspan.fi
> For my PGP key, see http://www.netspan.fi/pgp-vherva.html