> We're running Apache 1.2.6 on HP-UX.
We're running Apache 1.3.4 on HP-UX 11.0.
> Each night we stop the server
> (by killing the parent process's PID), rotate the logs, sleep for
> 5 seconds to make sure all of the children are gone, and start
> the server again.
We used to do this, but also added:
* A "ps -ef | grep httpd | grep -v grep" to find anything that was still
running after the 5-second sleep.
* A "lsof -i :80" to find any remaining processes still holding
port 80 (lsof is a third-party program; you can find an HP-UX port
here: http://hpux.csc.liv.ac.uk/hppd/hpux/Sysadmin/lsof-4.40/ )
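For what it's worth, those two checks could be wrapped into a small shell sketch along these lines. The function names and the PORT variable are made up for illustration, and it assumes lsof is installed and on the PATH. It only reports leftovers and returns non-zero if it finds any.

```shell
#!/bin/sh
# Sketch only: report anything still running or holding the port after a stop.
# PORT and all function names are assumptions, not part of any Apache tooling.
PORT="${PORT:-80}"

# Print leftover httpd processes; succeeds (0) only if at least one is found.
leftover_httpd() {
    ps -ef | grep httpd | grep -v grep
}

# Print processes holding the port; fails if lsof is missing or finds nothing.
port_holders() {
    if command -v lsof >/dev/null 2>&1; then
        lsof -i ":$PORT"
    else
        return 1
    fi
}

# Run both checks; return 0 when the stop looks clean, 1 otherwise.
check_clean_stop() {
    status=0
    if leftover_httpd; then
        echo "httpd processes still running" >&2
        status=1
    fi
    if port_holders; then
        echo "port $PORT still held" >&2
        status=1
    fi
    return $status
}
```

Calling check_clean_stop after the sleep and before the start gives the script a place to log or bail out. In the failure described in this thread both checks come back empty, so this would not have caught it.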
> This usually works fine, but occasionally, when
> the server tries to restart, it fails with the following message:
> bind: Address already in use
> httpd: could not bind to port 80
Yep, we get *exactly* the same thing with Apache and HP-UX - intermittently
failing to bind to port 80 on a stop/start.
> According to ps, there are no httpd processes to be found.
Yep, and "lsof -i :80" also reports no processes holding onto port 80.
> a bit and trying to start the server results in the same error message.
> I haven't found a way to solve the problem short of a reboot.
Ditto - I think it's a kernel bug. We've applied every kernel patch we could
find that has anything to do with TCP and the network, but to no avail.
> Has anyone encountered and solved this problem?
The solution is to change your scripts to use the "graceful restart" signal
instead - this is "kill -s SIGUSR1" to the parent httpd process rather than
doing a full stop and start. I've been using this for a while now and
it works fine.
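A minimal sketch of a rotation script built around the graceful restart might look like this. The paths, log file names and function names are assumptions for illustration, so adjust them for your install. The signal is written as "kill -USR1", which is the form the Apache documentation uses; it is the same signal as above.

```shell
#!/bin/sh
# Sketch only: rotate logs, then ask the parent httpd for a graceful restart.
# LOGDIR, PIDFILE and the log names are hypothetical defaults.
LOGDIR="${LOGDIR:-/usr/local/apache/logs}"
PIDFILE="${PIDFILE:-$LOGDIR/httpd.pid}"

# Rename the current logs with a date stamp; httpd keeps writing to the
# renamed files until it reopens its logs on restart.
rotate_logs() {
    stamp=$(date +%Y%m%d)
    for log in access_log error_log; do
        if [ -f "$LOGDIR/$log" ]; then
            mv "$LOGDIR/$log" "$LOGDIR/$log.$stamp"
        fi
    done
}

# Send SIGUSR1 to the parent httpd process named in the pid file.
graceful_restart() {
    if [ ! -f "$PIDFILE" ]; then
        echo "no pid file at $PIDFILE" >&2
        return 1
    fi
    kill -USR1 "$(cat "$PIDFILE")"
}

# Rotate first, then signal, so the new logs start empty.
rotate_and_restart() {
    rotate_logs && graceful_restart
}
```

You would call rotate_and_restart from cron in place of the stop/sleep/start sequence. Since the parent is never stopped, the script never has to win port 80 back. Old children may keep writing to the renamed logs for a while after the signal, so leave a delay before compressing or processing them.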
It is an extremely serious problem for HP-UX users of Apache who rotate their
logs daily via a stop/start like we do - since the rotation's done at
midnight and no amount of process killing will fix it, you end up with
a Web server that can be dead to the world until it's rebooted (either
remotely or from the console).
> An explanation of why
> it restarts most days, but not others would be of great help.
I think it's just some freaky conditions in the kernel (perhaps triggered
by Apache) that cause it. Unfortunately, HP don't seem to have any patch for
Connect, WWW: http://www.connect.org.uk/