Here's a Perl script that does 5 hostname lookups via
gethostbyname(), each wrapped in a 2-second alarm.
The problem is that once the alarm is triggered, *all*
subsequent lookups fail. Short of writing a C analog
that produces a similar set of system calls, I'm
having trouble understanding why this is happening.
Here's the code, a sample run, and a system call trace from strace
("www.rgee.com" is the first lookup that fails here):
% cat try.pl
#!/usr/bin/perl -w
use POSIX qw/:signal_h/;
sub alrm {die "timeout"}
use constant FLAGS => 0; # SA_RESTART; #SA_RESETHAND;# ???
# set up the requisite Perl objects for sigaction, per the POSIX manpage
$sigset = POSIX::SigSet->new(SIGALRM);
$sigact = POSIX::SigAction->new("main::alrm", $sigset, FLAGS);
for (<DATA>) {
    chomp;
    eval { # trap the alarm
        sigaction(SIGALRM, $sigact); # install the signal handler
        alarm 2;
        $addr = gethostbyname $_; # kaboom
        alarm 0;
    };
    if ($@) { # the handler died: report the timeout message
        print "$_ => $@\n";
    }
    else {
        $addr = $addr ? join '.', unpack('C4', $addr) : "BZZT";
        print "$_ => $addr\n";
    }
}
__END__
www.microsoft.com
www.aol.com
www.rgee.com
www.aol.com
www.microsoft.com
% ./try.pl
www.microsoft.com => 207.46.230.220
www.aol.com => 64.12.149.24
www.rgee.com => timeout at /tmp/try.pl line 4, <DATA> line 83.
www.aol.com => timeout at /tmp/try.pl line 4, <> line 83.
www.microsoft.com => timeout at /tmp/try.pl line 4, <> line 83.
% strace ./try.pl
...
send(5, "<\233\1\0\0\1\0\0\0\0\0\0\3www\4rgee\3com\0\0\1\0\1", 30, 0) = 30
gettimeofday({1003909154, 795857}, NULL) = 0
poll([{fd=5, events=POLLIN}], 1, 5000) = -1 EINTR (Interrupted system call)
--- SIGALRM (Alarm clock) ---
rt_sigprocmask(SIG_SETMASK, [RT_0], NULL, 8) = 0
write(1, "www.rgee.com => timeout at /tmp/"...,) = 58
write(1, "\n", 1
) = 1
rt_sigaction(SIGALRM, {0x40065f70, [], SA_RESTART|0x4000000},
{0x40065f70, [], 0x4000000}, 8) = 0
rt_sigaction(SIGALRM, {0x40065f70, [], 0x4000000}, NULL, 8) = 0
alarm(2) = 0
rt_sigprocmask(SIG_SETMASK, NULL, [RT_0], 8) = 0
rt_sigsuspend([] <unfinished ...>
--- SIGALRM (Alarm clock) ---
<... rt_sigsuspend resumed> ) = -1 EINTR (Interrupted system call)
rt_sigprocmask(SIG_SETMASK, [RT_0], NULL, 8) = 0
write(1, "www.aol.com => timeout at /tmp/t"...,) = 57
write(1, "\n", 1
) = 1
rt_sigaction(SIGALRM, {0x40065f70, [], SA_RESTART|0x4000000},
{0x40065f70, [], 0x4000000}, 8) = 0
rt_sigaction(SIGALRM, {0x40065f70, [], 0x4000000}, NULL, 8) = 0
alarm(2) = 0
rt_sigprocmask(SIG_SETMASK, NULL, [RT_0], 8) = 0
rt_sigsuspend([] <unfinished ...>
--- SIGALRM (Alarm clock) ---
...
The code misbehaves as above on Perl 5.00503 and later, on both
2.2 and 2.4 Linux kernels. However, it apparently works fine on
Solaris (subsequent gethostbyname calls succeed even after the
alarm goes off).
I've tried various flags for sigaction (SA_RESTART, SA_RESETHAND,
...), but the behavior is unchanged. I originally asked this
question a week or so ago in clp.misc, but nobody there could
explain the behavior. I'm obviously missing something; can anyone
help me out here?
--
Joe Schaefer