O(1) J9 scheduler: set_cpus_allowed

Post by Erich Focht » Sun, 03 Feb 2002 01:20:10



Do I understand correctly that there is no clean way right now to change
cpus_allowed of a task (except for current)? In the old scheduler it was
enough to set cpus_allowed and need_resched=1...


> > the function set_cpus_allowed(task_t *p, unsigned long new_mask) works
> > "as is" only if called for the task p=current. The appended patch
> > corrects this and enables e.g. external load balancers to change the
> > cpus_allowed mask of an arbitrary process.

> your patch does not solve the problem, the situation is more complex. What
> happens if the target task is not 'current' and is running on some other
> CPU? If we send the migration interrupt then nothing guarantees that the
> task will reschedule anytime soon, so the target CPU will keep spinning
> indefinitely. There are other problems too, like crossing calls to
> set_cpus_allowed(), etc. Right now set_cpus_allowed() can only be used for
> the current task, and must be used by kernel code that knows what it does.

I understand your point about crossing calls. Besides this, I still think
that set_cpus_allowed() does the job; after all, wait_task_inactive() has
to return at some point. But of course the target CPU shouldn't spend that
time in an interrupt routine.

What if you just 'remember' the task you want to take over on the target
CPU and do the real work later, in sched_tick()? After all, the task to be
moved is in state TASK_UNINTERRUPTIBLE and should be dequeued soon. In
sched_tick() one would then check whether the array field is NULL instead
of calling wait_task_inactive()...

> > BTW: how about moving the definitions of the runqueue and prio_array
> > structures into include/linux/sched.h and exporting the symbol
> > runqueues? It would help with debugging, monitoring and other
> > development.

> for debugging you can export it temporarily. Otherwise I consider it a
> feature that no scheduler internals are visible externally in any way.

Hmmm, some things could be useful to see from outside. rq->nr_running, for
example.

Regards,

Erich

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


O(1) J9 scheduler: set_cpus_allowed

Post by Ingo Molnar » Mon, 04 Feb 2002 21:10:09



> Do I understand correctly that there is no clean way right now to
> change cpus_allowed of a task (except for current)? In the old
> scheduler it was enough to set cpus_allowed and need_resched=1...

well, there is a way, by fixing the current mechanism. But since nothing
uses it currently it won't get much testing. I only pointed out that the
patch does not solve some of the races.

        Ingo


Test results of context-switch under the O(1) J9 scheduler

The O(1) J9 scheduler with changes to generic context switching clearly
provides the best context-switch latency, based on the lat_ctx test from
the LMBench benchmark. The test hardware was a 2-way SMP system with 512 MB
of memory. The results were obtained from 2.4.17 SMP kernels built with the
O(1)J2, O(1)J4 and O(1)J9 patches, compared against 2.2.19 base and 2.4.17
base. The data are times in microseconds. In almost all test cases, context
switches under O(1)J9 are much faster than under 2.4.17 base, and as fast
as under 2.2.19 (or faster at heavy load). The 2.2.19 measurements were
used as a reference point for 2.2.x.

             Base    Base    O(1)J2  O(1)J4  O(1)J9  Ratio of O(1)J9
#procs       2.2.19  2.4.17  2.4.17  2.4.17  2.4.17  / 2.4.17 base

lat_ctx -s 0 2 4 8 16 32 64
2     1.46  3.03  7.38  7.38  0.97  32.0%
4     2.04  3.55  4.84  4.89  4.97  140.2%
8     2.74  4.41  6.04  4.96  4.43  100.6%
16    3.01  4.78  4.93  5.46  3.82  80.0%
32    5.48  7.29  4.80  5.75  4.56  62.6%
64    5.74  8.37  5.86  6.12  5.62  67.1%

lat_ctx -s 16 2 4 8 16 32 64
2     14.33 16.15 15.62 15.11 13.93 86.3%
4     14.30 16.13 17.83 16.27 16.51 102.4%
8     14.39 16.38 17.58 17.35 15.20 92.8%
16    16.43 19.34 17.75 17.70 16.54 85.6%
32    39.92 39.93 24.63 24.95 32.65 81.8%
64    53.67 49.87 45.34 42.76 49.91 100.1%

lat_ctx -s 32 2 4 8 16 32 64
2     22.86 24.73 27.56 27.03 24.29 98.2%
4     22.85 24.85 25.98 25.78 25.22 101.5%
8     25.18 30.94 26.51 26.65 25.04 81.0%
16    58.70 74.32 38.69 35.07 48.32 65.0%
32    99.16 94.41 74.32 74.85 89.62 94.9%
64    99.28 96.91 98.45 97.75 94.29 97.3%

lat_ctx -s 64 2 4 8 16 32 64
2     40.24 42.06 44.57 44.11 40.72 96.8%
4     49.10 43.28 43.23 43.99 42.85 99.0%
8     111.28 105.56 49.43  45.81  58.41  55.3%
16    185.12 182.84 127.11 124.74 169.94 92.9%
32    185.20 182.81 184.97 182.92 175.70 96.1%
64    185.26 184.46 186.54 186.22 178.63 96.8%

Duc.
