I've got one that just about has me beat.
We've got an in-house developed application running on an L3000 with quad
500MHz 8600 processors that just can't get out of its own way. The
application itself was developed in a toolset called CellWorks and is used
to control / manage manufacturing equipment on one of our production lines.
Here's the lowdown on what I'm seeing:
First, the system has 1000Base-SX network controllers in it and is seeing on
average ~400-600 pps. I would not consider this a heavy load (we've been
able to easily sustain 10X this level during file transfer operations).
Second, the system has 2GB of memory, of which only 1GB is actually in use.
Dynamic buffer cache is set to 5%-20%, i.e. there's plenty of memory. Third,
physical IO to disk is very light. We verified this two ways: the tried and
true method of watching the activity lights on the drives themselves, and
loading the entire application onto a solid state disk drive, which gave no
noticeable performance improvement.
In case you were wondering, this app uses a very small (<25MB) memory
resident database to keep track of where production units are on a
manufacturing line. As a unit moves through the line, each piece of
equipment sends a message up to the app saying, in effect, "I've got
unit #?????, can I proceed?" The app looks in its database and, if
everything is A-OK, says "yeah, go ahead." This is not rocket science....
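
To make the flow concrete, here is a rough sketch of that check. It is in Python purely for illustration (the real app is built in CellWorks), and every name in it, including the unit IDs, station names, and the PROCEED/HOLD replies, is made up rather than taken from the application:

```python
# Hypothetical sketch only: the real app is built in CellWorks, not Python.
# All names here (units, handle_request, station ids, replies) are invented.

# Small memory-resident table: unit id -> station the unit is expected at next.
units = {"U1001": "station_2", "U1002": "station_5"}


def handle_request(unit_id: str, station: str) -> str:
    """Answer an equipment query: may this unit proceed at this station?"""
    expected = units.get(unit_id)
    if expected is None:
        return "HOLD"  # unit is not in the tracking table
    if expected != station:
        return "HOLD"  # unit turned up somewhere it was not expected
    return "PROCEED"  # everything checks out


if __name__ == "__main__":
    print(handle_request("U1001", "station_2"))
```

The point is only that each request is a single in-memory lookup, which is why multi-second response times look so out of proportion.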
However, we're seeing 4, 8, even 10 second delays in responding to equipment
requests when things really get cranked up. All I can find is a bunch
of apps waiting in "GBL_OTHER_IO_WAIT" (many of the key modules at greater
than 90%). The CPUs are running roughly 1x60% and 3x20% during peak loads, and
I can't find a single process being priority suspended. We've already got
PRM running on this system. I'm using one of the tuned kernel configuration
parameter sets and have only fiddled with maxdsiz_64bit and maxssiz_64bit, as
per HP's recommendation (which also had no impact).
Does anyone have any thoughts on just what exactly GBL_OTHER_IO_WAIT is, or
where I could look next? Or would you agree that we've thrown enough
hardware at this baby and the reality is it's just not gonna run any better?
Any relevant thoughts would be greatly appreciated....