Allocating processes to processors/processor sets

Post by Nanda Kishore » Fri, 13 Feb 2004 08:09:06



Hello,

I'm running a 24-job batch (120 jobs in all, in five cycles) on a
24-CPU Sun Fire 6800 box with 48 GB of RAM, and CPU usage is above
90% while all 24 jobs are running. Since these jobs are extremely CPU
intensive, would it help to bind them individually to CPUs, or
groups of them to processor sets? I'm wondering how much could be
saved by avoiding the context switches that time sharing causes. I
should mention, however, that these processes also do significant I/O,
although we have seen extremely encouraging (I'm told) response times
(averaging 1.6 milliseconds or so). Would the gains from avoiding the
switches be offset by increased idle time? If anyone has specific
experience in this regard, I would appreciate their input (any
response is welcome, though).

Thanks,
Nanda Kishore

 
 
 

Post by Darren Dunha » Fri, 13 Feb 2004 08:48:00




Just off the top of my head, I'd assume no.  The scheduler is already
going to try to keep things local to some extent.

If this machine is really going to just be a huge batch server for CPU
intensive tasks, I might consider testing a separate set of scheduler
parameters with dispadmin to have larger time quanta.
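
As a rough illustration of the dispadmin idea, here is a minimal Python
sketch that takes a time-sharing dispatch table in the text form printed
by `dispadmin -c TS -g` and multiplies every quantum by a factor. The
function name, the sample rows and the factor are all made up for
illustration; the table layout (a RES= line, then rows whose first field
is ts_quantum) is an assumption to check against your own system's output.

```python
def scale_quanta(table_text: str, factor: int) -> str:
    """Return a copy of a TS dispatch table with each ts_quantum multiplied.

    Assumes data rows start with an integer ts_quantum field, while
    comment lines (#...), the RES= line and blank lines do not.
    """
    out = []
    for line in table_text.splitlines():
        fields = line.split()
        if fields and fields[0].isdigit():
            # Data row: scale only the first column, keep the rest as-is.
            fields[0] = str(int(fields[0]) * factor)
            out.append("  ".join(fields))
        else:
            # Comment, RES= or blank line: pass through untouched.
            out.append(line)
    return "\n".join(out) + "\n"


# Illustrative rows only, not the real defaults of any Solaris release.
SAMPLE = """\
# Time Sharing Dispatcher Configuration
RES=1000

# ts_quantum  ts_tqexp  ts_slpret  ts_maxwait  ts_lwait  PRIORITY LEVEL
200  0  50  0  50  #  0
40  44  58  0  59  #  59
"""

if __name__ == "__main__":
    print(scale_quanta(SAMPLE, 2), end="")
```

The intended workflow would be to save the current table with
`dispadmin -c TS -g`, run the saved text through something like the
above, and load the result with `dispadmin -c TS -s <file>` on a test
system, keeping the original file around to restore from.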

Without testing, my assumption is that there wouldn't be a significant
improvement, though.

I'm certainly interested in what you find, though.  I don't have such a
box at my disposal to test in that manner.

--

Senior Technical Consultant         TAOS            http://www.taos.com/
Got some Dr Pepper?                           San Francisco, CA bay area
         < This line left intentionally blank to confuse you. >

 
 
 

Post by Gavin Maltb » Fri, 13 Feb 2004 10:14:00





> Just off the top of my head, I'd assume no.  The scheduler is already
> going to try to keep things local to some extent.

A thread has affinity to the CPU it last ran on if it ran "recently".
Otherwise a new one is selected according to various rules.  In Solaris 9
(and 8 too after some KU, I *think*) there is also knowledge of
latency groups in a system like the SF6800 (the 4 CPUs on a single
system board form a single latency group).  A thread is assigned a home
latency group, and an effort is made to perform some of
its memory allocations from that latency group.  Threads then
prefer to run in their home latency group, unless that's very busy
and some other group looks idle.

These general rules (much simplified above) cope reasonably well with
disparate workloads.  It's possible that if you have a very predictable
and static workload you can assign CPUs better yourself through pbind or
psrset.
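
For a static 24-job batch, the pbind route could be scripted along these
lines.  This is only a sketch: the function name, the PIDs and the CPU
ids are hypothetical, and it just builds the `pbind -b <cpu> <pid>`
command lines without running them.  Real CPU ids should come from
psrinfo on the box in question.  The psrset variant is not shown, since
the set id is only known once the set has been created.

```python
def pbind_commands(pids, cpu_ids):
    """Pair each PID with a CPU id, round-robin, as pbind argv lists.

    Nothing is executed here; the caller decides whether to run the
    commands (for example with subprocess.run) or just print them.
    """
    if not cpu_ids:
        raise ValueError("need at least one CPU id")
    return [
        ["pbind", "-b", str(cpu_ids[i % len(cpu_ids)]), str(pid)]
        for i, pid in enumerate(pids)
    ]


if __name__ == "__main__":
    # Hypothetical PIDs and CPU ids, purely for illustration.
    for argv in pbind_commands([101, 102, 103], [0, 1]):
        print(" ".join(argv))
```

With as many CPUs as jobs, each job ends up alone on its own CPU; with
fewer CPUs the jobs wrap around and share.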

The mpstat output has information on voluntary and involuntary
context switches (csw and icsw).  The former are typically where
you block for I/O etc.  The latter are where you are forced off the CPU
in favour of someone more deserving.  You can't really see
CPU migrations of a thread without something like DTrace, but
you can 'prstat 0' and watch which CPUs a process runs on and
see how much it moves around.
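
To compare the csw and icsw figures before and after any binding
experiment, something like the following could pull them out of a saved
mpstat report.  It is a minimal sketch: the function name is made up,
the sample numbers are invented, and it assumes one header line followed
by one row per CPU, finding the columns by header name rather than by
position.

```python
def context_switches(mpstat_text):
    """Map CPU id -> (csw, icsw) from a single mpstat report.

    Assumes the first non-blank line is the header and every later
    non-blank line is a per-CPU row with the same number of fields.
    """
    rows = [line.split() for line in mpstat_text.splitlines() if line.strip()]
    header = rows[0]
    cpu_i = header.index("CPU")
    csw_i = header.index("csw")
    icsw_i = header.index("icsw")
    return {int(r[cpu_i]): (int(r[csw_i]), int(r[icsw_i])) for r in rows[1:]}


# Invented numbers, shaped like a Solaris mpstat report.
SAMPLE_MPSTAT = """\
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
  0    1   0    5  310  208 120   30    4    2   0   900  92   5  0   3
  1    0   0    3   12   10  80   60    2    1   0   400  95   3  0   2
"""

if __name__ == "__main__":
    for cpu, (csw, icsw) in sorted(context_switches(SAMPLE_MPSTAT).items()):
        print(f"cpu {cpu}: csw={csw} icsw={icsw}")
```

If icsw stays high on the CPUs running the batch jobs, that would
suggest the jobs are being pushed off the CPU rather than leaving it to
wait on I/O.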

[cut]

Gavin

 
 
 
