Sporadic SIGPIPE errors in OSE 5.0.5 -- kernel tuning needed?

Post by imi_.. » Sat, 12 Aug 2000 04:00:00



08/11/2K 08:07AM

Cross posting to comp.unix.sco.programmer and comp.unix.sco.misc as I'm
not sure which is most appropriate.

The short version: Running SCO OSE 5.0.5, perl5 (5.0 patchlevel 5
subversion 3), fetchmail 5.4.4 and trestlemail .82 results in sporadic
"SIGPIPE thrown from an MDA or a stream socket error" failures, even on
small messages, and not always on the same message each time. The
workaround at the moment (not yet tested in a production environment, so reliability
is still a concern) seems to be to specify in the .fetchmailrc an MDA
of:


It's the addition of the "trace -o /dev/null" at the beginning that
seems to make all the difference. Otherwise, I'm able to duplicate the
SIGPIPE error quite consistently.

HERE'S THE QUESTION: What (if any) SCO OSE 5.0.5 tunable parameter
could affect SIGPIPE signals?

The long version -- based on the following from Scott Gifford posted to
the fetchmail friends mailing list:

>If you're seeing the same problem with my Perl one-liner as you are
>with your own MDA, it demonstrates that the problem is extremely
>likely to be with the client exiting before it has read all of a
>message.  The fact that even my deliberately broken MDA succeeds on
>some messages and fails on others indicates that if your MDA is
>similarly broken, you may still see success under some circumstances.
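
A minimal sketch of the mechanism described in the quote above, assuming a POSIX shell. The writer loop stands in for fetchmail and `head -n 1` stands in for an MDA that exits before reading all of its input; both are stand-ins for illustration only, and `demo_sigpipe` is a made-up name.

```shell
#!/bin/sh
# Illustration only: the writer plays the role of fetchmail, and `head -n 1`
# plays an MDA that exits after reading only part of its input.
demo_sigpipe() {
    {
        {
            # Keep writing until a write fails or a signal ends the process.
            sh -c 'while echo data; do :; done'
            # Report the writer's exit status on fd 3, which bypasses the pipe.
            echo "writer status: $?" >&3
        } | head -n 1 >/dev/null
    } 3>&1
}

demo_sigpipe
```

When SIGPIPE has its default disposition, the writer is killed by the signal and the reported status is above 128 (141, that is 128 plus signal number 13, is typical). How often a real sender hits this depends on whether it is still writing when the reader goes away, which would fit failures that come and go.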

I've been working extensively with Scott Bronson, the author of
trestlemail, on this. We don't think it's the MDA itself; we can't see
that it's the piece that's failing. More below.

>Bottom line:  Look at your MDA program, that's most likely where the
>problem lies.  Maybe add some good debugging code to it, so you can
>tell when it's failing.

Scott Bronson suggested I add some debugging/logging code that writes to
a FIFO (named pipe). Briefly, here's what I did: After the "# OK let's
go..." comment line that is very early in the trestlemail code, I
added:

   #next line is patch by ScottB to me to debug SIGPIPE error
   open(OUT, ">/tmp/tmout") or die "Could not open /tmp/tmout: $!\n";
   print OUT "This is a message sent immediately after open of tmout.\n";
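
The patch only does something useful if /tmp/tmout already exists as a FIFO, presumably created with mkfifo beforehand. A small sketch of how such a FIFO behaves, assuming a POSIX shell; the path and the name `fifo_demo` are made up for the demo and are not the ones used in the tests described here:

```shell
#!/bin/sh
# Illustration only: /tmp/tmout.demo.$$ is a hypothetical path for this demo.
fifo_demo() {
    fifo=/tmp/tmout.demo.$$
    mkfifo "$fifo" || return 1
    # Writer: opening the FIFO for writing blocks until a reader opens it.
    ( echo "message via fifo" > "$fifo" ) &
    # Reader: cat copies what arrives and exits at end of file, which happens
    # once the last writer has closed its end.
    cat "$fifo"
    wait
    rm -f "$fifo"
}

fifo_demo
```

The point that matters for the debugging patch is the blocking open: the Perl open() on the FIFO does not return until something has the FIFO open for reading, so the reader's timing also changes the timing of the MDA.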

I then run fetchmail and at the point where it says:

   fetchmail: reading message 1 of 5 (750 octets)

and it executes the trestlemail MDA (_not_ using the "trace -o
/dev/null" workaround above), I then must:

   cat /tmp/tmout

to read from the FIFO once for each message to be downloaded. Result? No SIGPIPE
has ever occurred in the hundred or so tests I've run by doing the cat
above manually. BUT, if I write a script to do the catting:

   until false ; do cat /tmp/tmout ; done

the SIGPIPE happens during every session, and typically (but not
always) on the first message (whereas without this bit of debugging
code, the SIGPIPE normally happens around the third message). BUT, if I
modify the above script to have it sleep:

   until false ; do sleep 3 ; cat /tmp/tmout ; done

fetchmail runs just fine with no SIGPIPE at all.
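
One way to probe the early-exit theory without touching the MDA itself would be a wrapper that reads the whole message before starting the delivery command. This is only a sketch under that assumption; `spool_then_deliver` and the spool file name are hypothetical and not part of fetchmail or trestlemail, and `head -n 1` below merely stands in for a delivery program that stops reading early.

```shell
#!/bin/sh
# Hypothetical helper, not part of fetchmail or trestlemail.
spool_then_deliver() {
    spool=${TMPDIR:-/tmp}/mda-spool.$$
    # Read stdin to end of file first, so the sender never sees an early close.
    cat > "$spool" || { rm -f "$spool"; return 1; }
    # Then run the real delivery command with the spooled message as its input.
    "$@" < "$spool"
    status=$?
    rm -f "$spool"
    # Pass the delivery command's exit status back to the caller.
    return "$status"
}

# Stand-in for a delivery command that reads only part of its input.
printf 'line 1\nline 2\n' | spool_then_deliver head -n 1
```

Saved as a script that ends with `spool_then_deliver "$@"` instead of the demo line, it could be named in the .fetchmailrc mda option in front of the real delivery command. If the SIGPIPE disappears with the wrapper in place, that would point at the reader closing early rather than at a kernel tunable.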

So, at the moment some SCO kernel parameter or "feature" :) (or maybe
even perl?) seems to be suspect but I have no idea where to look, or
even if tools exist to pinpoint the problem further. Ideas, anyone?

--
Todd Andrews
