NTP & HP-UX 11


Post by Art D'Ada » Fri, 05 Feb 1999 04:00:00



It seems that NTP for HP-UX 11 runs only in STEP mode, not SLEW mode.

For HP-UX 10.20, there's a patch that puts NTP in SLEW mode. No
similar patch exists for 11.

I'm still trying to resolve with HP why this is so, but my question
is: does anyone run databases on an 11 host with NTP? If so,
has NTP stepping back the time caused database problems?

Please post and, if you can, email responses too.

Thanks.

 
 
 

NTP & HP-UX 11

Post by David Dalt » Sat, 06 Feb 1999 04:00:00


:>It seems that NTP for HP-UX 11 runs only in STEP mode, not SLEW mode.

:>For HP-UX 10.20, there's a patch that puts NTP in SLEW mode. No
:>similar patch exists for 11.

PHNE_12689 is not a general distribution patch.  The slew behavior that it
provides is non-standard, and not recommended for most customers.  We will
not be making the slew behavior standard in any foreseeable release of HP-UX
because it has less precision and dramatically less stability than the STEP
behavior.

There might someday be an HP-UX 11.x patch that provides the SLEW behavior,
but in the meantime you can just extract the "xntpd" binary from PHNE_12689
(probably using "tar xf PHNE_12689.depot") and it will run just fine on
HP-UX 11.x with the SLEW behavior you desire.  This is what "backward
compatibility" is about, and we have a very good backward compatibility
story on HP-UX 11.x.

:>I'm still trying to resolve with HP why this is so, but my question
:>is: does anyone run databases on an 11 host with NTP? If so,
:>has NTP stepping back the time caused database problems?

The slew behavior is a kludge, a band-aid covering up the real problem.  If
you have a real source of time and good network connections, your system
will never get outside the 128 millisecond window and thus never step at
all.  For example, I have stratum-4 clients that are 5000 kilometers away
from my HP58503 GPS receiver that have never been 50 milliseconds off and
never made a step (except at startup) in the several years they have been
running NTP.

The key to a successful NTP network is thoughtful configuration.  Steps in
time are very disruptive (as you know), and should never occur with a good
configuration.  My experience is that poor network connections cause most
of the problems in this area.  Many applications (FTP, sendmail) are quite
tolerant of network delays, dropped packets, retransmissions, etc.  NTP is
not tolerant.  For many people NTP is the first application that exposes
networking problems that have remained hidden until now.

And of course I would be remiss if I didn't mention that you can always
grab the latest NTP distribution from U Delaware and build it yourself with
SLEWALWAYS and any other weird parameters and customizations you desire.

--
-> My $.02 only   Not an official statement from HP {They make me say that}
--
     As far as we know, our computer has never had an undetected error.
---------------------------------------------------------------------------


 
 
 

NTP & HP-UX 11

Post by Art D'Ada » Sat, 06 Feb 1999 04:00:00




>:>I'm still trying to resolve with HP why this is so, but my question
>:>is: does anyone run databases on an 11 host with NTP? If so,
>:>has NTP stepping back the time caused database problems?

>The slew behavior is a kludge, a band-aid covering up the real problem.  If
>you have a real source of time and good network connections, your system
>will never get outside the 128 millisecond window and thus never step at
>all.  For example, I have stratum-4 clients that are 5000 kilometers away
>from my HP58503 GPS receiver that have never been 50 milliseconds off and
>never made a step (except at startup) in the several years they have been
>running NTP.

David,

Thanks for the info. One point: what about a site like ours where the
only access our HP hosts have to an NTP time server is through our
Firewall? Suppose the Firewall goes down for a few hours and our
hosts drift a few seconds ahead. When the Firewall comes back up
we do not want any hosts to step back a few seconds, as I've heard
stepping backwards in time can cause some serious database problems.

So what do you recommend in this case, aside from slewing?

It would be nice if our hosts had direct access to multiple time
servers, but even then suppose the outside network connection
goes down?

Art

 
 
 

NTP & HP-UX 11

Post by Tom Lan » Mon, 08 Feb 1999 04:00:00



> Thanks for the info. One point: what about a site like ours where the
> only access our HP hosts have to an NTP time server is through our
> Firewall? Suppose the Firewall goes down for a few hours and our
> hosts drift a few seconds ahead.

Art,

The right answer is not to let that happen.  If you are running
a correctly configured NTP daemon on your client machines, it
will acquire a pretty good idea of the machine's native clock
error and will manage to keep good time even through many hours
of connection outage.

I used to run an NTP daemon on the HP715 I'm typing this on, using
no local time reference and only a couple of remote servers ---
via a dialup modem connection that was only up a few hours per
day.  The 715's native clock error is close to 1.5 sec/day (which
HP ought to be embarrassed by, but never mind).  The daemon kept
good time anyway, since it would continue to slew the clock by
what it had estimated to be the clock error even when it couldn't
contact a time server, which was most of the time.
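
To put rough numbers on the free-running case described above, here is a minimal Python sketch. It only does the arithmetic and is not how ntpd computes anything. The function names and the `residual_fraction` parameter are hypothetical, introduced purely for illustration; the 1.5 sec/day figure is the one quoted in this post.

```python
SECONDS_PER_DAY = 86_400

def drift_ppm(drift_s_per_day):
    """Express a clock's native error in parts per million."""
    return drift_s_per_day / SECONDS_PER_DAY * 1e6

def offline_error_s(hours_offline, drift_s_per_day, residual_fraction=1.0):
    """Offset accumulated while no time server is reachable.

    residual_fraction is a hypothetical knob (an assumption, not an ntpd
    setting): 1.0 means the drift is not compensated at all, 0.05 means
    the daemon's frequency estimate leaves 5% of the drift uncorrected.
    """
    return drift_s_per_day * (hours_offline / 24) * residual_fraction

if __name__ == "__main__":
    print(f"native error: {drift_ppm(1.5):.1f} ppm")
    print(f"6 h offline, uncompensated: {offline_error_s(6, 1.5):.3f} s")
    print(f"6 h offline, 5% residual:   {offline_error_s(6, 1.5, 0.05):.4f} s")
```

The point of the sketch is only that a daemon which keeps applying its last frequency estimate accumulates a small fraction of the error that a free-running clock would over the same outage.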

I had to drop the daemon when I got a dial-on-demand ISDN setup,
since I didn't want the ISDN router to bring up the line every 15
minutes in response to ntp pings.  Now I run "ntpdate" three times
a day to keep my clock within half a second or so of real time ---
but I got much better timekeeping performance from ntpd.  It was
rarely off by 100msec even after many hours offline.

                        regards, tom lane

 
 
 

NTP & HP-UX 11

Post by Art D'Ada » Mon, 08 Feb 1999 04:00:00


Thank you David & Tom for your responses.

One point I still fail to understand is this: in a perfect
world, the local host would never get fast or slow, but
given that a host may sooner or later get fast or slow,
why is the STEP method of correction favored over
the SLEW method by ntp? It seems to me that
setting a host backward in time is inherently dangerous,
so why isn't the SLEW method the default method?

Art

 
 
 

NTP & HP-UX 11

Post by Tom Lan » Tue, 09 Feb 1999 04:00:00



> why is the STEP method of correction favored over
> the SLEW method by ntp? It seems to me that
> setting a host backward in time is inherently dangerous,
> so why isn't the SLEW method the default method?

Huh?  Slew *is* the favored method.  Step is only applied
if the error is unreasonably large and you'd be spending
a long time slewing.

Now you can quibble with the size of ntp's threshold for an
"unreasonably large" error.  128 msec is indeed unreasonably
large for a setup where the time server is always available
over a reliable network, but if you have long service outages
perhaps it is too tight.  Still, if your clock is minutes or
hours off, you'd want it fixed NOW, not slewed for the next
several fortnights.
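
The slew-first, step-only-when-large rule described above can be sketched in a few lines of Python. This is a simplified illustration, not ntpd's actual code: the real daemon's logic is more involved, and the function names, the sign convention for `offset_s`, and the choice of `<=` at the boundary are assumptions made here. Only the 128 ms figure comes from the thread.

```python
STEP_THRESHOLD_S = 0.128  # the 128 ms window discussed in this thread

def choose_correction(offset_s, threshold_s=STEP_THRESHOLD_S):
    """Return 'slew' for small offsets and 'step' for large ones.

    offset_s is (server time - local time), an assumed convention:
    a negative value means the local clock is ahead.
    """
    if abs(offset_s) <= threshold_s:
        return "slew"
    return "step"

def is_backward_step(offset_s, threshold_s=STEP_THRESHOLD_S):
    """True when the correction would set the clock back in one jump."""
    return choose_correction(offset_s, threshold_s) == "step" and offset_s < 0

if __name__ == "__main__":
    for offset in (0.05, -0.05, 3.0, -3.0):
        print(offset, choose_correction(offset), is_backward_step(offset))
```

In this model the case the original poster worries about is a local clock a few seconds ahead, which shows up as a backward step; raising `threshold_s` turns it into a slew at the cost of a longer correction time.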

True, stepping the clock backwards can confuse some programs;
for instance "make" might get confused about whether .o files
are up to date with respect to .c files; but it's not the end
of the world.  If you have reason to run ntp then you probably
have problems that occur when machine time gets too far away
from real time, so at some point you are probably going to
prefer an immediate fix via step to a "safe" slew.

Last I heard, you can force the daemon never to step with a
suitable command-line switch, if you are certain that the
error will never get intolerably large.  You could also go in
and tweak the step threshold, but that might take a recompile.

                        regards, tom lane

 
 
 

NTP & HP-UX 11

Post by David Dalt » Wed, 10 Feb 1999 04:00:00


:>Thank you David & Tom for your responses.

:>One point I still fail to understand is this: in a perfect
:>world, the local host would never get fast or slow, but
:>given that a host may sooner or later get fast or slow,
:>why is the STEP method of correction favored over
:>the SLEW method by ntp? It seems to me that
:>setting a host backward in time is inherently dangerous,
:>so why isn't the SLEW method the default method?

Because it is less accurate.  The maximum slew rate depends on your
hardware, but on my workstation it is about 40 milliseconds per second.  If
you are off by 5 minutes (say at startup), then it can take 125 minutes
just to slew into nominal range and much longer than that to stabilize.
This is simply unacceptable, so a line must be drawn somewhere that says
"this offset is too big to correct by SLEWING".  That dividing line is set
at 128 milliseconds.
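
The arithmetic behind the 125-minute figure can be checked with a short Python sketch. It assumes a constant maximum slew rate and ignores the extra time needed to stabilize afterwards; the function name and the use of milliseconds as the unit are choices made here for illustration.

```python
def slew_minutes(offset_ms, max_slew_ms_per_s):
    """Minutes needed to remove offset_ms at a constant maximum slew rate."""
    if max_slew_ms_per_s <= 0:
        raise ValueError("slew rate must be positive")
    seconds = abs(offset_ms) / max_slew_ms_per_s
    return seconds / 60

if __name__ == "__main__":
    five_minutes_ms = 5 * 60 * 1000
    # 300000 ms / 40 ms per second = 7500 s = 125 minutes
    print(slew_minutes(five_minutes_ms, 40))
```

The same function shows why a larger window is costly: the correction time grows linearly with the offset, so every doubling of the tolerated offset doubles the time spent away from correct time.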

The SLEW method is the default when you are inside the 128 millisecond
window.  In the world of precision timekeeping 128 milliseconds is a BIG
offset (think of it as 128 million nanoseconds) and a "normal" NTP user
would not tolerate being that far off and sloooowly slewing back to the
correct time.  Especially because slew rate is limited, you might NEVER get
anywhere close to the correct time by this method.  That is considered
"bad".

So I would say that the first regime of the NTP user is where the
timesources are good and steady, and no steps ever happen at all.

The lower regime of the NTP user has not much concern for precision, and
it is adequate to simply run "ntpdate" periodically to keep the local
clock nominally correct.

You seem to be desiring an intermediate regime where precision requirements
are quite relaxed, the timesources are flaky, the network connections are
flaky, but the non-negotiable requirement is that a backward step must
_never_ happen.  I understand your concern, but I think you can imagine
this scenario has not gotten a lot of attention in the world of precision
timekeeping.  NTP is not optimized for this situation, it is optimized for
nanosecond precision.

I think you can get a SLEW version of "ntpdate" to solve your problem.

--
-> My $.02 only   Not an official statement from HP {They make me say that}
--
     As far as we know, our computer has never had an undetected error.
---------------------------------------------------------------------------

 
 
 

NTP & HP-UX 11

Post by David Dalt » Wed, 10 Feb 1999 04:00:00


:>Last I heard, you can force the daemon never to step with a
:>suitable command-line switch, if you are certain that the
:>error will never get intolerably large.  You could also go in
:>and tweak the step threshold, but that might take a recompile.

I guess the problem is that SLEWALWAYS is a compile-time switch, not a
run-time switch or command-line option.  

Many people have asked for the slew behavior without realizing the
instabilities involved.  I think of slewing like a fish that you have just
hauled into the boat.  You are trying to get the fish clamped down, but it
is flopping all over the place.  It is very hard to predict where it will
flop next, and very dangerous if the speed and range of the flopping are
not constrained.  The fish might flop right back out of the boat in some
cases.

To prevent this wild flopping about, NTP places strict limits on the
maximum slew rate and the maximum slew range.  This damps and constrains
the excursions, and guarantees convergence.  Quite a lot of careful
engineering went into analyzing, characterizing, and testing this dynamic
system, and my hat is off to Professor Mills and his cronies who did the
work.  Revisit your textbook on control theory if you have forgotten how
complex this is.

It is possible to recompile NTP and change some of these parameters, but
you do this at your peril.  In particular, changing the slew range from 128
milliseconds to infinite means that you might never converge on the correct
time at all.  Add the possibility that all your timesources might be
unavailable for extended periods (while slewing continues) and the flopping
can get arbitrarily wild.  This is no way to run your NTP hierarchy.

If you care so little about correct time that you are willing to accept the
possibility of unbounded oscillations, then you shouldn't be running
NTP at all.  Just turn your system on and let the clock free-run, and the
time will always be monotonically increasing with no backward steps.  You
can set the clock at power-up using the eyeball-and-wristwatch method, or
perhaps "ntpdate" before your database starts up, and then just live with
the drift rate until the next power-on reset.  The drift is far less
dangerous than unbounded oscillations.

--
-> My $.02 only   Not an official statement from HP {They make me say that}
--
     As far as we know, our computer has never had an undetected error.
---------------------------------------------------------------------------

 
 
 

NTP & HP-UX 11

Post by Philip Hombur » Thu, 11 Feb 1999 04:00:00




>I guess the problem is that SLEWALWAYS is a compile-time switch, not a
>run-time switch or command-line option.  
>It is possible to recompile NTP and change some of these parameters, but
>you do this at your peril.  In particular, changing the slew range from 128
>milliseconds to infinite means that you might never converge on the correct
>time at all.  Add the possibility that all your timesources might be
>unavailable for extended periods (while slewing continues) and the flopping
>can get arbitrarily wild.  This is no way to run your NTP hierarchy.
>If you care so little about correct time that you are willing to accept the
>possibility of unbounded oscillations, then you shouldn't be running
>NTP at all.  Just turn your system on and let the clock free-run, and the
>time will always be monotonically increasing with no backward steps.  You
>can set the clock at power-up using the eyeball-and-wristwatch method, or
>perhaps "ntpdate" before your database starts up, and then just live with
>the drift rate until the next power-on reset.  The drift is far less
>dangerous than unbounded oscillations.

This is true for the algorithm used by ntpd. However, it seems quite possible
to design other algorithms that are compatible with the ntp protocol but
without this drawback. These algorithms may not offer nanosecond accuracy,
but not everybody needs that.

Another approach is to select a set of parameters for ntpd that allow this
constant to be raised to a larger value, say 10 seconds.

                                        Philip Homburg

 
 
 
