
From: Paul Sladen (vserver_at_paul.sladen.org)
Date: Thu 07 Nov 2002 - 13:13:03 GMT


On Thu, 7 Nov 2002, Nuno Silva wrote:

> My advice is: "upgrade" to 2.4.18 :)
> My current record with 2.4.19 with moderate load but high I/O is 12 days

Both my production vserver boxes suffer from the `hangs' Cathy has
described.

PIV 1.6GHz, 1GB [no highmem], ext3, LVM, IDE Soft-"RAID"
2.4.19-pre7-ctx10-ide20020510 is my ``more reliable'' box so far...

  75 days, 09:25:41 | Linux 2.4.19-pre7-ctx1 Sun Jul 14
   0 days, 09:21:45 | Linux 2.4.19-pre7-ctx1 Sat Sep 28
  38 days, 22:20:20 | Linux 2.4.19-pre7-ctx1 Sun Sep 29

On one occasion I found ``out of file handles'' in the terminal-server
scrollback, so now I monitor that. As Cathy points out, monitoring and log
writing (and presumably the processes trying to do it) stop completely once
the machine ends up in this state.
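
A minimal sketch of that kind of file-handle monitoring, assuming the kernel
exposes /proc/sys/fs/file-nr as three whitespace-separated integers
(allocated, allocated-but-unused, maximum). The function names and the 0.9
threshold are illustrative assumptions, not part of any existing tool:

```python
from pathlib import Path

# Assumed location of the kernel's file-handle counters.
FILE_NR = Path("/proc/sys/fs/file-nr")


def parse_file_nr(text):
    """Split the contents of file-nr into (allocated, unused, maximum)."""
    allocated, unused, maximum = (int(field) for field in text.split()[:3])
    return allocated, unused, maximum


def handles_in_use_ratio(text):
    """Fraction of the system-wide file-handle limit currently in use."""
    allocated, unused, maximum = parse_file_nr(text)
    return (allocated - unused) / maximum


def check(threshold=0.9, path=FILE_NR):
    """Return (alarm, ratio); alarm is True once usage reaches threshold."""
    ratio = handles_in_use_ratio(path.read_text())
    return ratio >= threshold, ratio
```

Calling check() from cron or a small loop and alerting over the network when
the alarm flag is set would be one way to use it. Note that a check like this
only helps while userspace is still running; it cannot report anything once
the box has already hung.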

PIII 700MHz, 192MB [no highmem ;-)], ext2, md+SCSI, IDE
2.4.18ctx-10 is my ``less reliable'' box, viz:

   9 days, 22:42:04 | Linux 2.4.18ctx-10 Tue May 14
  56 days, 22:51:47 | Linux 2.4.18ctx-10 Fri May 24
  32 days, 21:26:16 | Linux 2.4.18ctx-10 Sat Jul 20
  13 days, 13:29:46 | Linux 2.4.18ctx-10 Thu Aug 22
  10 days, 05:49:08 | Linux 2.4.18ctx-10 Tue Sep 17
   3 days, 06:01:25 | Linux 2.4.18ctx-10 Thu Sep 5
   5 days, 20:31:13 | Linux 2.4.18ctx-10 Sun Sep 8
   4 days, 05:47:06 | Linux 2.4.18ctx-10 Mon Sep 30
  22 days, 17:39:13 | Linux 2.4.18ctx-10 Sat Oct 5
   9 days, 21:37:55 | Linux 2.4.18ctx-10 Mon Oct 28

This last one was a genuine Oops that spewed output (and was not rebootable
with [break] on the serial console). Most [all?] of the rest (sadly too many
to count...) have been hangs where the box still answers ICMP echo requests
and half-opens TCP connections, and can be rebooted with SysRq from the
serial console.

The softdog (kernel/userspace watchdog) cannot be persuaded to reboot the
machines when they end up in this state, even though the kernel, not having
received an update from userspace, should reboot! And that is with
*everything* turned on (completely paranoid state) in the watchdog program.
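
For readers unfamiliar with the mechanism, the userspace half of a watchdog
can be sketched roughly as below. This is an illustration under assumptions,
not the watchdog program referred to above: it assumes the driver is exposed
as a device file (conventionally /dev/watchdog), that any write resets the
kernel timer, and that the timer fires a reboot if no write arrives in time.
The names pet_watchdog and healthy are hypothetical.

```python
import time


def pet_watchdog(dev, healthy, interval=10.0, sleep=time.sleep):
    """Write a heartbeat to dev every interval seconds while healthy() holds.

    dev is a writable binary file object (assumed: the open watchdog device).
    When healthy() returns False the loop stops writing and leaves the device
    open, so the kernel-side timer is expected to expire and reboot the box.
    Returns the number of heartbeats written.
    """
    beats = 0
    while healthy():
        dev.write(b"\0")  # any write counts as a heartbeat
        dev.flush()
        beats += 1
        sleep(interval)   # must be shorter than the driver's timeout
    return beats
```

A real deployment would open the device unbuffered, for example
open("/dev/watchdog", "wb", buffering=0), and pass a healthy() callback that
runs the desired checks. The sketch only shows why a stalled userspace should
lead to a reboot; it does not explain why that fails to happen in the hangs
described here.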

        -Paul

-- 
Nottingham, GB


Generated on Thu 07 Nov 2002 - 14:28:02 GMT by hypermail 2.1.3