What's magic about 7.32 seconds?

Post by Dave Littel » Tue, 04 Nov 2003 06:07:05



Hi all,

I'm seeing something jump into the processor for almost exactly a
millisecond every 7.32 seconds.  It does happen in single-user mode,
but I don't recall seeing this in earlier tests.  Clearly I turned something
on I shouldn't have or turned something off that someone still
wants to see.  Killing vold got rid of a 3 mSec chirp every 3 seconds
but now this one crops up.  Urg... ;-)

Any ideas?

Thanks,
Dave


What's magic about 7.32 seconds?

Post by Gavin Maltby » Wed, 05 Nov 2003 05:27:00



> I'm seeing something jump into the processor for almost exactly a
> millisecond every 7.32 seconds.  It does happen in single-user mode,
> but I don't recall seeing this in earlier tests.  Clearly I turned something
> on I shouldn't have or turned something off that someone still
> wants to see.  Killing vold got rid of a 3 mSec chirp every 3 seconds
> but now this one crops up.  Urg... ;-)

How are you detecting this activity?

This sounds like it may be either fsflush or memscrub.  fsflush wakes
up every 5s by default as far as I remember.  memscrub wakes up
at odd looking intervals in order to scrub all of physical memory
over a 12 hr period.  Try disabling the latter (see Sunsolve)
temporarily to confirm.

When you care about others sharing a cpu with you, the approach
is to set aside processors in a processor set (psrset(1M)) and
set all of those no-intr (no device interrupts).  Now only threads
that you launch into that processor set (and their children) run
there; that keeps off even the service kernel threads such as
fsflush.
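
As a sketch (the cpu number, set id and program name below are purely
illustrative), that might look like:

# psrset -c 1                  # create a processor set containing cpu 1; prints the new set id (say 1)
# psradm -i 1                  # mark cpu 1 no-intr so it no longer fields device interrupts
# psrset -e 1 ./my_rt_app      # launch the realtime app so it (and its children) run only in set 1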

Gavin


What's magic about 7.32 seconds?

Post by Dave Littel » Wed, 05 Nov 2003 08:57:47




>> I'm seeing something jump into the processor for almost exactly a
>> millisecond every 7.32 seconds.  It does happen in single-user mode,
>> but I don't recall seeing this in earlier tests.  Clearly I turned something
>> on I shouldn't have or turned something off that someone still
>> wants to see.  Killing vold got rid of a 3 mSec chirp every 3 seconds
>> but now this one crops up.  Urg... ;-)

> How are you detecting this activity?

A real-time task sends a timestamped message to another task at a
periodic rate.  The receiving task timestamps the receipt of the message
and keeps stats.  It looks really good except for these little chirps that
are showing up...

> This sounds like it may be either fsflush or memscrub.  fsflush wakes
> up every 5s by default as far as I remember.  memscrub wakes up
> at odd looking intervals in order to scrub all of physical memory
> over a 12 hr period.  Try disabling the latter (see Sunsolve)
> temporarily to confirm.

I'd love to, but I have no SunSolve access.  I'm just a lowly
Solaris x86 user with no service contract.  :-(

I am evaluating Solaris for potential use at work, but I need to get
past some of the learning curve to know what to turn off to prevent
these chirps, even on a uniprocessor.  On a multiprocessor, we'd
certainly want to minimize the noise, but I can't find any info at all
on what to turn off (and how).  Someone else was kind enough to
mention disabling vold to get rid of a 3mSec/3Sec chirp, but I
guess there's lots more...
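
For reference, on this vintage of Solaris disabling vold amounts to
something like the following (the rc link name may vary slightly by
release):

# /etc/init.d/volmgt stop                          # stop vold now
# mv /etc/rc2.d/S92volmgt /etc/rc2.d/s92volmgt     # and keep it from starting at boot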

> When you care about other sharing a cpu with you the approach
> is to set aside processors in a processor set (psrset(1M)) and
> set all those no-intr (no device interrupts).  Now only threads
> that you launch to that processor set (and their children) run
> there, and this includes the service kernel threads such as
> fsflush etc.

Good info for when I can get a multiprocessor system - thanks.

Dave


What's magic about 7.32 seconds?

Post by Gavin Maltby » Wed, 05 Nov 2003 19:30:33





>>> I'm seeing something jump into the processor for almost exactly a
>>> millisecond every 7.32 seconds.  It does happen in single-user mode,
>>> but I don't recall seeing this in earlier tests.  Clearly I turned something
>>> on I shouldn't have or turned something off that someone still
>>> wants to see.  Killing vold got rid of a 3 mSec chirp every 3 seconds
>>> but now this one crops up.  Urg... ;-)

>> How are you detecting this activity?

> A real-time task sends a timestamped message to another task at a
> periodic rate.  The receiving task timestamps the receipt of the message
> and keeps stats.  It looks really good except for these little chirps that
> are showing up...

This *may* be bug 4912731, in which the memscrub thread can overstay
its welcome on cpu for a moment too long, and can therefore interfere
with realtime dispatch if realtime has not been isolated to its own
CPU set.  This affects both sparc and x86 Solaris (although the history
to each is different).  A workaround is to reduce the memscrub span
but let's verify the cause first:

You can cause the memscrub to exit on a booted x86 system with

# echo "disable_memscrub/W1" | mdb -kw

Note that you can't start it again.  You can make the thread never
start by setting that in /etc/system (of course you then have
nothing keeping memory fresh).
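
For instance, assuming it is the same disable_memscrub variable poked
with mdb above, the /etc/system entry would be:

set disable_memscrub = 1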

Try that and see if the blip disappears.

Alternatively, here's how the memscrub wakeup interval is computed
on x86:

Obtain values for the following with the commands shown:

memscrub_period_sec     # echo "memscrub_period_sec/U" | mdb -k
memscrub_span_pages     # echo "memscrub_span_pages/U" | mdb -k
memscrub_phys_pages     # echo "memscrub_phys_pages/U" | mdb -k

The first two have default values, the last depends on your installed
physical memory size.  Now:

for memscrub_phys_pages <= memscrub_span_pages, use memscrub_period_sec;
otherwise use memscrub_period_sec / (memscrub_phys_pages / memscrub_span_pages)

The result gives the interval in seconds of memscrub wakeups
(provided it isn't falling behind).  Is this the magic 7.32s?

>> This sounds like it may be either fsflush or memscrub.  fsflush wakes
>> up every 5s by default as far as I remember.  memscrub wakes up
>> at odd looking intervals in order to scrub all of physical memory
>> over a 12 hr period.  Try disabling the latter (see Sunsolve)
>> temporarily to confirm.

> I'd love to, but I have no SunSolve access.  I'm just a lowly
> Solaris x86 user with no service contract.  :-(

And you managed to get the owner of the above bug - not bad :-)

> I am evaluating Solaris for potential use at work, but I need to get
> past some of the learning curve to know what to turn off to prevent
> these chirps, even on a uniprocessor.  On a multiprocessor, we'd
> certainly want to minimize the noise, but I can't find any info at all
> on what to turn off (and how).  Someone else was kind enough to
> mention disabling vold to get rid of a 3mSec/3Sec chirp, but I
> guess there's lots more...

The vold sounds like a bug, too, if it's interfering with realtime
dispatch.

Realtime on a uniprocessor should work, but relies on all parties
playing the game correctly.  The only things of higher priority
than realtime are interrupts, and threads which disable kernel
preemption for short periods (they're not higher priority, but
we can't kick them off cpu until they reenable preemption).
Device interrupts are supposed to be very short and so should
not delay realtime dispatch beyond bounds;  but of course if
you have some badly behaved driver it may do otherwise.
Similarly, if someone disables preemption for longer than they
should (as memscrub does) they can interfere.

On a multiprocessor you dedicate cpu resource and shelter that
from device interrupts (not always possible on x86 architecture)
to isolate your realtime processes.  Of course to call them
realtime processes, as opposed to merely processes in the RT
scheduling class, they have a number of obligations to fulfill
such as locking their memory in physical memory, not sharing
data disks with others, avoiding lazy linking etc.
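
As an aside, putting a process in the RT class from the shell is only
the first of those obligations; a sketch (the program name is made up):

# priocntl -e -c RT -p 59 ./my_rt_app    # run my_rt_app in the realtime class at RT priority 59

The memory-locking, disk-sharing and lazy-linking points still have to
be handled by the application itself.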

Gavin


>> When you care about other sharing a cpu with you the approach
>> is to set aside processors in a processor set (psrset(1M)) and
>> set all those no-intr (no device interrupts).  Now only threads
>> that you launch to that processor set (and their children) run
>> there, and this includes the service kernel threads such as
>> fsflush etc.

> Good info for when I can get a multiprocessor system - thanks.

> Dave


What's magic about 7.32 seconds?

Post by Dave Littel » Thu, 06 Nov 2003 13:28:48



> This *may* be bug 4912731, in which the memscrub thread can overstay
> its welcome on cpu for a moment too long, and can therefore interfere
> with realtime dispatch if realtime has not been isolated to its own
> CPU set.  This affects both sparc and x86 Solaris (although the history
> to each is different).  A workaround is to reduce the memscrub span
> but let's verify the cause first:

> You can cause the memscrub to exit on a booted x86 system with

> # echo "disable_memscrub/W1" | mdb -kw

> Note that you can't start it again.  You can make the thread never
> start by setting that in /etc/system (of course you then have
> nothing keeping memory fresh).

> Try that and see if the blip disappears.

Hi Gavin,

Thanks very much for replying - I disabled memscrub as you suggested
but the extra 1 mSec chirps appear identical to the earlier ones.

I am curious, though, about the "nothing keeping memory fresh"
observation.  Maybe I'm just not versed enough, but I'm not at all
sure what that means.


> Alternatively, here's how the memscrub wakeup interval is computed
> on x86:

> Obtain values for the following with the commands shown:

> memscrub_period_sec    # echo "memscrub_period_sec/U" | mdb -k
> memscrub_span_pages    # echo "memscrub_span_pages/U" | mdb -k
> memscrub_phys_pages    # echo "memscrub_phys_pages/U" | mdb -k

> The first two have default values, the last depends on your installed
> physical memory size.  Now:

> for memscrub_phys_pages <= memscrub_span_pages, use memscrub_period_sec;
> otherwise use memscrub_period_sec / (memscrub_phys_pages / memscrub_span_pages)

> The result gives the interval in seconds of memscrub wakeups
> (provided it isn't falling behind).  Is this the magic 7.32s?

I don't think so:

memscrub_period_sec = 43200
memscrub_span_pages = 1024
memscrub_phys_pages = 130955

The numbers don't want to cooperate and cough up 7.32 sec.  ;-)
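
For what it's worth, plugging those numbers into the "otherwise" case
quoted above (as I read it) gives:

# echo "43200 * 1024 / 130955" | bc
337

i.e. a wakeup only every ~340 seconds - nowhere near 7.32.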


> The vold sounds like a bug, too, if it's interfering with realtime
> dispatch.

Yes, vold definitely made a difference, although I wonder if some
of it might be due to an IDE CDROM and/or a SCSI Jaz (neither
with media in place).


> On a multiprocessor you dedicate cpu resource and shelter that
> from device interrupts (not always possible on x86 architecture)

Could you please comment on this a bit more?  Why would it not
be possible in some instances?  Back to the HCL we go for
multiprocessor systems?

> to isolate your realtime processes.  Of course to call them
> realtime processes, as opposed to merely processes in the RT
> scheduling class, they have a number of obligations to fulfill
> such as locking their memory in physical memory, not sharing
> data disks with others, avoiding lazy linking etc.

Well, for this little exercise my test apps aren't locking their
pages and are not statically linked.  I tried a static link, but
the linker complains about not being able to find lrt in a
static form.  That was kinda interesting...

I'd like to think if I was getting paged the timing traces would
be much fuzzier than they are over the timespan of the test
(currently 30 seconds).  I'd also hope that a dynamic link
wouldn't be an issue since these apps are so very small and
everything should be bound and stable after one iteration.
Or am I off base here?

Thanks very much for your help.  I sure hope we can get past
this and move on to the next adventure.

Dave


What's magic about 7.32 seconds?

Post by Gavin Maltby » Thu, 06 Nov 2003 20:21:30


[cut]

> Thanks very much for replying - I disabled memscrub as you suggested
> but the extra 1 mSec chirps appear identical to the earlier ones.

:-(

> I am curious, though, about the "nothing keeping memory fresh"
> observation.  Maybe I'm just not versed enough, but I'm not at all
> sure what that means.

DIMMs do obviously keep their cells charged etc so that's not what
I meant by fresh.  The purpose of memscrub is to be sure that all
of physical memory is read over a period in order to trigger any
latent faults.  The idea is that some single bit (correctable through
ECC) errors will creep into memory from time to time (magic moonbeams
and the like).  But current DIMMs don't detect and report this bad ECC
until a read access - at which point the data can be rewritten with
correct ECC.  If we don't force a read periodically then there's
a danger that the single bit error in an ECC word could be struck
again and become a doublebit error (not correctable in current ECC
codes, just detectable).

>> Alternatively, here's how the memscrub wakeup interval is computed
>> on x86:

>> Obtain values for the following with the commands shown:

>> memscrub_period_sec    # echo "memscrub_period_sec/U" | mdb -k
>> memscrub_span_pages    # echo "memscrub_span_pages/U" | mdb -k
>> memscrub_phys_pages    # echo "memscrub_phys_pages/U" | mdb -k

>> The first two have default values, the last depends on your installed
>> physical memory size.  Now:

>> for memscrub_phys_pages <= memscrub_span_pages, use memscrub_period_sec;
>> otherwise use memscrub_period_sec / (memscrub_phys_pages / memscrub_span_pages)

>> The result gives the interval in seconds of memscrub wakeups
>> (provided it isn't falling behind).  Is this the magic 7.32s?

> I don't think so:

> memscrub_period_sec = 43200
> memscrub_span_pages = 1024
> memscrub_phys_pages = 130955

> The numbers don't want to cooperate and cough up 7.32 sec.  ;-)

>> The vold sounds like a bug, too, if it's interfering with realtime
>> dispatch.

> Yes, vold definitely made a difference, although I wonder if some
> of it might be due to an IDE CDROM and/or a SCSI Jaz (neither
> with media in place).

>> On a multiprocessor you dedicate cpu resource and shelter that
>> from device interrupts (not always possible on x86 architecture)

> Could you please comment on this a bit more?  Why would it not
> be possible in some instances?  Back to the HCL we go for
> multiprocessor systems?

I don't know the details.  I do know that Solaris cpu offline
code caters for the possibility that on non-sparc platforms
it is not always possible to redirect interrupts away from
a given cpu.  I don't know if that means all x86 systems
or some subset - I don't really know how interrupts work
in x86 land (in Sun systems we can reprogram the IO bridges
to redirect their device interrupts for the pci/sbus slots
they offer to go to different cpus).

If you have an MP system choose a cpu to offline and have a
look at its offline state.  The following is for cpu 19
(decimal) on a SF6800:

# psradm -f 19
# echo "0t19::cpuinfo -v" | mdb -k
  ID ADDR        FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD      PROC
  19 3000cc12000  2f    0    0  -1   no    no t-10798 2a100663d20 (idle)
                   |
        RUNNING <--+
          READY
       QUIESCED
         EXISTS
        OFFLINE

It is both OFFLINE and QUIESCED which means we managed to redirect
the interrupts it was servicing.  On some x86 systems it is possible
to be OFFLINE without being QUIESCED.
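
To undo the experiment afterwards, psradm can bring the processor back
online:

# psradm -n 19                # bring cpu 19 back online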

>> to isolate your realtime processes.  Of course to call them
>> realtime processes, as opposed to merely processes in the RT
>> scheduling class, they have a number of obligations to fulfill
>> such as locking their memory in physical memory, not sharing
>> data disks with others, avoiding lazy linking etc.

> Well, for this little exercise my test apps aren't locking their
> pages and are not statically linked.  I tried a static link, but
> the linker complains about not being able to find lrt in a
> static form.  That was kinda interesting...

Static linking is officially evil - see countless past religious
arguments on this.  But you don't need static linking - you need
to avoid lazy linking.  See manpage for ld.so.1 for LD_BIND_NOW.
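
For example (binary name illustrative), resolving all bindings up front
instead of lazily at first call:

# LD_BIND_NOW=1 ./my_rt_app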

> I'd like to think if I was getting paged the timing traces would
> be much fuzzier than they are over the timespan of the test
> (currently 30 seconds).  I'd also hope that a dynamic link
> wouldn't be an issue since these apps are so very small and
> everything should be bound and stable after one iteration.
> Or am I off base here?

Sounds ok.

> Thanks very much for your help.  I sure hope we can get past
> this and move on to the next adventure.

You might have success through enabling all TNF probes and seeing
if they reveal the periodic activity you're looking for.  See
prex(1).  In the next version of Solaris we have some functionality
that would let you pinpoint this activity really quickly ...
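
If you go that route, a kernel-probe session looks roughly like this
(check prex(1) for the exact subcommand syntax; the output file name is
arbitrary):

# prex -k                     # attach to the running kernel's TNF probes
prex> buffer alloc 2m         # allocate an in-kernel trace buffer
prex> enable $all             # enable every probe
prex> trace $all              # and have them write trace records
prex> ktrace on               # start tracing; let a few of the 7.32s blips go by
prex> ktrace off
prex> quit
# tnfxtract /tmp/ktrace.tnf   # copy the kernel buffer out to a file
# tnfdump /tmp/ktrace.tnf     # dump it as text and look for the periodic activity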

Gavin


What's magic about 7.32 seconds?

Post by Dave Littel » Fri, 07 Nov 2003 13:21:56


I really hope you're on some island in the Caribbean - otherwise
you get up way too early, man!

Gavin Maltby wrote (at 5:20 in the freakin' morning!):

> :-(

But wait!  I knew I had recently turned something on, and it turned
out to be xntpd!  The pig is sneaking in there and stealing a millisecond
every 7.32 seconds.  I've been told that at least some portion of
xntpd runs in the real-time class, so maybe there's some untoward
behavior in it that warrants a bug being filed?

I noticed also that a kill -9 of xntpd didn't rid the system of the
timing chirps.  I had to move xntpd's config file and reboot to
clear the thing out for real.  Interesting.  Would a -TERM have
been a better choice?

[deletia]


>> I'd like to think if I was getting paged the timing traces would
>> be much fuzzier than they are over the timespan of the test
>> (currently 30 seconds).  I'd also hope that a dynamic link
>> wouldn't be an issue since these apps are so very small and
>> everything should be bound and stable after one iteration.
>> Or am I off base here?

> Sounds ok.

I incorporated an mlockall() into both processes and started them
with LD_BIND_NOW=1, and it seems to have made a noticeable,
but not a massive, improvement in the timings.

>> Thanks very much for your help.  I sure hope we can get past
>> this and move on to the next adventure.

> You might have success through enabling all TNF probes and seeing
> if they reveal the periodic activity you're looking for.  See
> prex(1).  In the next version of Solaris we have some functionality
> that would let you pinpoint this activity really quickly ...

Cool.

Thanks again for all your help.  It taught me some good things I
needed to know and got me back on track.

Dave


What's magic about 7.32 seconds?

Post by Logan Sha » Fri, 07 Nov 2003 16:13:29



> I noticed also that a kill -9 of xntpd didn't rid the system of the
> timing chirps.  I had to move xntpd's config file and reboot to
> clear the thing out for real.  Interesting.  Would a -TERM have
> been a better choice?

A -TERM is *always* a better choice than a -9, unless you've tried
the -TERM and everything else, and none of it worked.  -9 is just
basically asking for trouble.  When I was a system admin, I'd get
a call every few weeks from someone that decided to kill -9 their
Xsun, and then they were surprised that they couldn't type on
the console and a bunch of garbage was coming up whenever they hit
a key.  The SIGKILL had forced Xsun to exit without putting the
keyboard back in ascii mode, and the Sun console expects ascii
keyboard events and doesn't do well when it receives raw keyboard
events instead.  These same people would put "kbd_mode -a" at
the end of their .xinitrc in hopes of fixing the problem, when
the real problem was that they forced the system to switch states
but also forced it not to do the necessary stuff to make the
switch happen properly.

This could turn into a mini-rant, but the point is that on Unix,
*most* resource allocation is done at the process level and thus
a -9 works OK much of the time, but there are still resources
that don't and can't get deallocated automatically by the kernel.
Most of the time, when a -9 is used, a -TERM would have sufficed
just fine.  There are exceptions, but they really are pretty
rare, and it's not hard in those few cases to just do regular
"kill" followed by "kill -9".  It's really, really easy if you
know how to use shell command histories.
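
In shell terms (pgrep/pkill just save the ps | grep dance), that is
something like:

# pkill xntpd                                   # polite SIGTERM first
# sleep 5
# pgrep xntpd > /dev/null && pkill -9 xntpd     # escalate only if it ignored the TERM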

By the way, as I recall/understand, ntpd needs to run periodically
like that to make the appropriate adjustments to your clock, so
that the clock stays very close to real time.  I'm not sure, but
I expect that the 7.32 seconds is a function of your hardware clock's
actual frequency.

Also, I wonder if the millisecond isn't actually being used by
ntpd, but instead ntpd is simply adjusting the time by about
a millisecond at those intervals.  That could also account for
discrepancies in the time that events happen, if the time is
measured only by the system's internal clock and not some external
time source.  Of course, it may not matter, and it's just an
academic question really...

   - Logan


What's magic about 7.32 seconds?

Post by John Howells » Fri, 07 Nov 2003 19:16:55



> But wait!  I knew I had recently turned something on, and it turned
> out to be xntpd!  The pig is sneaking in there and stealing a millisecond
> every 7.32 seconds.  I've been told that at least some portion of
> xntpd runs in the real-time class, so maybe there's some untoward
> behavior in it that warrants a bug being filed?

I doubt it. It is probably not even ntpd that is running every 7.32
seconds but the kernel. ntp will have called the ntp_adjtime function
("man ntp_adjtime") to tell the kernel what adjustment to make to the
system clock behaviour. The kernel will implement the required
correction independently of the ntp daemon, and the exact way the kernel
performs the time corrections differs on various Sun platforms. On a T1
AC200 we found it caused the time to appear to step about 30mS forward
every two minutes. In your case it appears to be making finer
corrections more often. The bulk of the lost millisecond is probably
only virtual, as nothing is processing for that length of time. Rather,
the kernel changes the application's view of the time by skipping the
clock forward.

> I noticed also that a kill -9 of xntpd didn't rid the system of the
> timing chirps.  I had to move xntpd's config file and reboot to
> clear the thing out for real.  Interesting.

Which confirms what I noted above, that once the ntp daemon has told the
kernel what corrections to make those corrections carry on irrespective
of whether the ntp daemon is running.

> Would a -TERM have
> been a better choice?

Probably not. The kernel will carry on making the corrections defined by
the last call to ntp_adjtime until the system is rebooted. However you
terminate the ntp daemon, kill -9, kill -TERM or kill -HUP, it will not
make a call to ntp_adjtime to cancel the clock adjustments.

John Howells


What's magic about 7.32 seconds?

Post by Beard » Sat, 08 Nov 2003 00:00:14


[Excised]

> The idea is that some single bit (correctable through
> ECC) errors will creep into memory from time to time (magic moonbeams
> and the like).  But current DIMMs don't detect and report this bad ECC

Magic mushrooms, more likely...

[Excised again]

> You might have success through enabling all TNF probes and seeing
> if they reveal the periodic activity you're looking for.  See
> prex(1).  In the next version of Solaris we have some functionality
> that would let you pinpoint this activity really quickly ...

> Gavin

Gavin, you obviously have a keyboard shortcut to generate the text "In
the next version of Solaris " ;^) Tempting us firstly with dtrace, and
now the ability to "pinpoint this activity really quickly". Come clean,
and let us know the URL of where we can find fuller details of the funky
stuff in "the next version". We are salivating... well I am, anyways ;^)

What's magic about 7.32 seconds?

Post by Barry Margoli » Sat, 08 Nov 2003 00:51:24




> Also, I wonder if the millisecond isn't actually being used by
> ntpd, but instead ntpd is simply adjusting the time by about
> a millisecond at those intervals.

It would also explain why the problem persisted after ntpd was killed.
When ntpd notices that the clock is wrong, it calls adjtime(), which tells
the kernel to slow down or speed up the clock until the adjustment is
reached.  ntpd only has to call this once, it then continues
automatically.  The kernel can't speed up the real-time clock, it's just a
crystal oscillator of some kind, so it presumably accomplishes the task by
stealing or adding a microsecond after every few ticks.

--

Level(3), Woburn, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.


What's magic about 7.32 seconds?

Post by Richard L. Hamilt » Sat, 08 Nov 2003 09:47:24




> I really hope you're on some island in the Caribbean - otherwise
> you get up way too early, man!

> Gavin Maltby wrote (at 5:20 in the freakin' morning!):

>> :-(

> But wait!  I knew I had recently turned something on, and it turned
> out to be xntpd!  The pig is sneaking in there and stealing a millisecond
> every 7.32 seconds.  I've been told that at least some portion of
> xntpd runs in the real-time class, so maybe there's some untoward
> behavior in it that warrants a bug being filed?

> I noticed also that a kill -9 of xntpd didn't rid the system of the
> timing chirps.  I had to move xntpd's config file and reboot to
> clear the thing out for real.  Interesting.  Would a -TERM have
> been a better choice?

[...]

Only explanation I can think of is a side effect of an adjtime(2)
adjustment by xntpd that was still taking place even after xntpd was
killed, in which case -TERM would only be better if xntpd caught it and
cancelled the adjtime() call with another one asking for a 0 adjustment.
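
(If one wanted to try that by hand, Solaris date(1) exposes adjtime(2)
as "date -a"; assuming a zero delta really does cancel a pending slew,
and that the correction is a plain adjtime() one rather than kernel PLL
state left by ntp_adjtime(), something like

# date -a 0.0

ought to stop it.)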

--


What's magic about 7.32 seconds?

Post by Frank Cusac » Sat, 08 Nov 2003 14:04:24



> Only explanation I can think of is a side effect of an adjtime(2)
> adjustment by xntpd that was still taking place even after xntpd was
> killed, in which case -TERM would only be better if xntpd caught it and
> cancelled the adjtime() call with another one asking for a 0 adjustment.

Yeah, it would be really interesting to see the result of kill -TERM,
if the OP has the time to try this.

/fc