limits on SCHED_FIFO tasks

Post by rm » Thu, 20 Mar 2003 02:00:13



Hi,

        I've included a preliminary proof-of-concept patch to
2.4.20(+ll) which allows the superuser to set a limit, using sysctls,
on the amount of cpu time (counted in timer ticks) SCHED_FIFO tasks
may use.  (right now, uniprocessor only (no APIC), and it doesn't
handle jiffies rollover).
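
The rollover caveat applies to the plain `jiffies >= rt_period_end` comparison in the patch, which gives the wrong answer once the tick counter wraps. A minimal user-space sketch of a wrap-safe comparison follows; it uses the same idea as the kernel's time_after_eq() macro. The helper names tick_after_eq and period_expired are hypothetical and not part of the patch.

```c
#include <limits.h>

/*
 * Hypothetical helper: returns 1 if tick count `a` is at or after `b`,
 * even when the counter has wrapped, provided the two values are less
 * than half the counter range apart.
 */
int tick_after_eq(unsigned long a, unsigned long b)
{
    /* Unsigned subtraction wraps modulo 2^N, so a small difference
     * means `a` is at or ahead of `b`. */
    return (a - b) <= (ULONG_MAX / 2);
}

/* The period-expiry test from the patch, using the wrap-safe compare. */
int period_expired(unsigned long now, unsigned long period_end)
{
    return tick_after_eq(now, period_end);
}
```

With this form, a period that ends just before the counter wraps is still seen as expired shortly after the wrap, where the plain `>=` test would treat it as still running.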

    rt_period_reserved is the number of jiffies out of every
rt_period_length jiffies which are available to SCHED_FIFO tasks.  so
for example

    rt_period_length = 50
    rt_period_reserved = 25

allows SCHED_FIFO tasks to use half of all available ticks during a 50
tick period.  setting rt_period_reserved = 50 (equal to
rt_period_length) would allow the current behaviour.  (the patch
defaults to 45 out of 50.)
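
The accounting described above can be modelled in user space. The following is a simplified sketch, not kernel code: the struct and function names are hypothetical, and it merges the period refresh (done in the scheduler in the patch) and the per-tick decrement (done in the timer code) into one loop. It assumes a SCHED_FIFO task that wants the cpu on every tick, and like the patch it ignores counter rollover.

```c
/* Hypothetical user-space model of the patch's budget accounting. */
struct rt_budget {
    unsigned long period_length;   /* ticks per accounting period */
    unsigned long period_reserved; /* ticks per period usable by SCHED_FIFO */
    unsigned long period_end;      /* tick at which the current period ends */
    unsigned long remain;          /* SCHED_FIFO ticks left in this period */
};

/* Start a new period if the current one has ended. */
void rt_budget_refresh(struct rt_budget *b, unsigned long now)
{
    if (now >= b->period_end) {
        b->period_end = now + b->period_length;
        b->remain = b->period_reserved;
    }
}

/*
 * Simulate `ticks` timer ticks against an always-runnable SCHED_FIFO
 * task and return how many ticks that task actually received.
 */
unsigned long rt_budget_simulate(struct rt_budget *b, unsigned long ticks)
{
    unsigned long now, used = 0;

    for (now = 0; now < ticks; now++) {
        rt_budget_refresh(b, now);
        if (b->remain > 0) {
            b->remain--;   /* tick charged to the SCHED_FIFO task */
            used++;
        }
        /* otherwise the budget is spent and the tick goes to other tasks */
    }
    return used;
}
```

In this model, a 50/25 setting gives the task 50 of every 100 ticks, and a 50/50 setting gives it all of them.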

the rationale for this approach is that in audio applications (for
example), low-latency real-time performance is desired.  this in turn
means small audio buffers and tight timing constraints to guarantee
glitch-free audio.  lately, SCHED_FIFO (with the low-latency and
preemption patches) has been successfully used to do this, but one huge
downside is that if there is a bug in the SCHED_FIFO task, it is very
easy to completely hang the box. since programmers aren't going to
suddenly start writing perfect code, this is what i came up with (it's
similar to what Mach's constrained scheduling policy does).  with this
patch i've been able to keep a console (slowly) interactive with
rt_period_reserved = 45 and rt_period_length = 50 while a SCHED_FIFO
task does while(1);.

in the same vein, allowing a limited amount of memory pinning by
non-privileged users is the sort of change which audio folks would
like to see, to make pro-audio applications extremely reliable without
compromising the underlying security of the system.

i'm interested in hearing folks' thoughts on this.  (please CC replies).

                          thanks,
                          rob

--- pristine/linux-2.4.20/kernel/sched.c        2003-03-17 23:24:02.000000000 -0500
+++ linux/kernel/sched.c        2003-03-18 13:22:38.000000000 -0500

 unsigned securebits = SECUREBITS_DEFAULT; /* systemwide security settings */

+unsigned long rt_period_start = 0;
+unsigned long rt_period_end = 0;
+unsigned long rt_period_remain = 0;
+unsigned long rt_period_length = 50;
+unsigned long rt_period_reserved = 45;
+
 extern void mem_use(void);


         * runqueue (taking priorities within processes
         * into account).
         */
+
+      
+      
+       /*
+        *   check if we are in the right time period
+        *
+        *   XXX what if it burns through its entire quantum and
+        *       into the next ?
+        *    
+        */
+       if (jiffies >= rt_period_end) {
+         /* no, start over from now */
+         rt_period_start = jiffies;
+         rt_period_end = rt_period_length + rt_period_start;
+         rt_period_remain = rt_period_reserved;
+       }
+      
+       /*
+        *  is there any remaining time ?
+        *  
+        */
+      
+       if (rt_period_remain > 0) {
        weight = 1000 + p->rt_priority;
+       }  else {
+         /* redundant, for clarity */
+         weight = -1;
+       }
+
 out:
        return weight;
 }
--- pristine/linux-2.4.20/kernel/sysctl.c       2003-03-17 23:24:02.000000000 -0500
+++ linux/kernel/sysctl.c       2003-03-18 13:05:22.000000000 -0500

 extern int core_uses_pid;
 extern int cad_pid;

+
+extern unsigned long rt_period_length;
+extern unsigned long rt_period_reserved;
+
+
 /* this is needed for the proc_dointvec_minmax for [fs_]overflow UID and GID */
 static int maxolduid = 65535;

        {KERN_LOWLATENCY, "lowlatency", &enable_lowlatency, sizeof (int),
         0644, NULL, &proc_dointvec},
 #endif
+
+       {KERN_FIFOSCHED_PERIOD, "rtsched-period", &rt_period_length,
+          sizeof (int), 0644, NULL, &proc_dointvec},
+       {KERN_FIFOSCHED_RESERV, "rtsched-reserve",
+        &rt_period_reserved, sizeof (int), 0644, NULL, &proc_dointvec},
+
        {0}
 };

--- pristine/linux-2.4.20/include/linux/sysctl.h        2003-03-17 23:24:02.000000000 -0500
+++ linux/include/linux/sysctl.h        2003-03-18 13:03:37.000000000 -0500

        KERN_TAINTED=53,        /* int: various kernel tainted flags */
        KERN_CADPID=54,         /* int: PID of the process to notify on CAD */
        KERN_LOWLATENCY=55,     /* int: enable low latency scheduling */
+       KERN_FIFOSCHED_PERIOD=56, /* max time rt processes can take up */
+       KERN_FIFOSCHED_RESERV=57, /* "" */
 };

--- pristine/linux-2.4.20/kernel/timer.c        2002-11-28 18:53:15.000000000 -0500
+++ linux/kernel/timer.c        2003-03-18 12:41:05.000000000 -0500

 #define NOOF_TVECS (sizeof(tvecs) / sizeof(tvecs[0]))

+extern unsigned long rt_period_start;
+extern unsigned long rt_period_end;  
+extern unsigned long rt_period_remain;
+extern unsigned long rt_period_length;
+extern unsigned long rt_period_reserved;
+
+
+
+
+
 void init_timervecs (void)
 {

                                p->need_resched = 1;
                        }
                }
+              
+               if (p->policy == SCHED_FIFO) {
+                 if (rt_period_remain == 0) {
+                   p->need_resched = 1;
+                 } else {
+                   rt_period_remain--;
+                 }
+               }
+
                if (p->nice > 0)
                        kstat.per_cpu_nice[cpu] += user_tick;
                else
--------------------- end -----------------------------------------

----
Robert Melby
Georgia Institute of Technology, Atlanta Georgia, 30332
uucp:     ...!{decvax,hplabs,ncar,purdue,rutgers}!gatech!prism!gt4255a

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in

More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

 
 
 

limits on SCHED_FIFO tasks

Post by george anzinge » Thu, 20 Mar 2003 03:20:12


If the issue is regaining control after some RT task goes into a loop,
the way this is usually done is to keep a session around with a higher
priority.  Using this concept, one might provide tools that, from
userland, ensure that such a session exists prior to launching the
"suspect" code.  I fail to see the need for this sort of code in the
kernel.

-g
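
A minimal sketch of the userland approach described above, assuming the rescue session is a process that raises itself to a SCHED_FIFO priority above the suspect task before that task is launched. The helper names rescue_priority and become_rescue_session are hypothetical. sched_setscheduler() needs appropriate privilege, so the second function fails with -1 for an ordinary user.

```c
#include <sched.h>

/*
 * Hypothetical helper: pick a SCHED_FIFO priority strictly above the
 * suspect task's priority. Returns -1 if the suspect already runs at
 * the maximum priority and cannot be outranked.
 */
int rescue_priority(int suspect_prio)
{
    int max = sched_get_priority_max(SCHED_FIFO);
    int min = sched_get_priority_min(SCHED_FIFO);

    if (max < 0 || min < 0)
        return -1;          /* policy not supported */
    if (suspect_prio >= max)
        return -1;          /* nothing higher is available */
    if (suspect_prio < min)
        return min;         /* any real-time priority outranks it */
    return suspect_prio + 1;
}

/*
 * Make the calling process (for example, a shell launcher) a
 * SCHED_FIFO task above the suspect. Returns 0 on success, -1 on error.
 */
int become_rescue_session(int suspect_prio)
{
    struct sched_param sp;
    int prio = rescue_priority(suspect_prio);

    if (prio < 0)
        return -1;
    sp.sched_priority = prio;
    return sched_setscheduler(0, SCHED_FIFO, &sp);
}
```

A launcher would call become_rescue_session() first and only then start the suspect program at the lower priority. This only helps as long as the rescue session itself never has to wait on a lower-priority task.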



--

High-res-timers:  http://sourceforge.net/projects/high-res-timers/
Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml


 
 
 

limits on SCHED_FIFO tasks

Post by rm » Thu, 20 Mar 2003 04:50:08



> If the issue is regaining control after some RT task goes into a loop,
> the way this is usually done is to keep a session around with a higher
> priority.  Using this concept, one might provide tools that, from
> userland, ensure that such a session exists prior to launching the
> "suspect" code.  I fail to see the need for this sort of code in the
> kernel.

some of the code out there already uses a watchdog process. there are
a couple of issues with it.

first, it's difficult to figure out when another process is behaving
badly. and if another process is behaving badly, the only choice you
have is to kill it. secondly, what happens if the watchdog process
dies or the bug is in that process itself? i once accidentally killed
the thread serving as the watchdog and then hung the system trying to
bring down the SCHED_FIFO program afterward.

the glib answer is, 'don't do stupid things'. unfortunately, people do
stupid things, and the system should be protected from them to the
extent possible. it's like saying, 'preemptive multitasking isn't
necessary because cooperative multitasking works fine and offers lower
overhead...just don't run stupid code with bugs in it'.

it seems like this is more easily and more completely solved by
allowing limitations to be placed on what a SCHED_FIFO task can do, in
the same way that the kernel allows you to put limits on users and
programs.

i'm not arguing against other people using SCHED_FIFO to lock up their
boxes, just that there needs to be something in between running as
SCHED_OTHER and running with absolutely no limits as SCHED_FIFO.

                thanks,
                 rob

----
Robert Melby
Georgia Institute of Technology, Atlanta Georgia, 30332
uucp:     ...!{decvax,hplabs,ncar,purdue,rutgers}!gatech!prism!gt4255a


 
 
 

limits on SCHED_FIFO tasks

Post by Andrew Morto » Thu, 20 Mar 2003 05:10:08



> If the issue is regaining control after some RT task goes into a loop,
> the way this is usually done is to keep a session around with a higher
> priority.  Using this concept, one might provide tools that, from
> userland, ensure that such a session exists prior to launching the
> "suspect" code.  I fail to see the need for this sort of code in the
> kernel.

That works, until your shell calls ext3_mark_inode_dirty(), which blocks on
kjournald activity.  kjournald is SCHED_OTHER, and never runs...
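
The failure mode can be illustrated with a toy model of strict-priority scheduling. This is a sketch under stated assumptions, not the kernel's scheduler: the struct, the pick_next function and the priority numbers are hypothetical. A task with rt_priority 0 stands for SCHED_OTHER, and a task whose waits_on field is set is blocked until that other task runs.

```c
#include <stddef.h>

/* Hypothetical toy task: rt_priority 0 means SCHED_OTHER. */
struct task {
    const char *name;
    int rt_priority;
    const struct task *waits_on; /* task this one is blocked on, or NULL */
};

/*
 * Strict-priority pick: return the highest-priority task that is not
 * blocked, or NULL if every task is blocked.
 */
const struct task *pick_next(const struct task *tasks, size_t n)
{
    const struct task *best = NULL;
    size_t i;

    for (i = 0; i < n; i++) {
        if (tasks[i].waits_on != NULL)
            continue;                       /* blocked, cannot run */
        if (best == NULL || tasks[i].rt_priority > best->rt_priority)
            best = &tasks[i];
    }
    return best;
}
```

With a runnable kjournald at priority 0, a runaway task at priority 50 and a rescue shell at priority 99 that waits on kjournald, pick_next() returns the runaway task every time. kjournald is never chosen, so the shell never unblocks, even though it has the highest priority.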


 
 
 

limits on SCHED_FIFO tasks

Post by george anzinge » Thu, 20 Mar 2003 10:30:13




>>If the issue is regaining control after some RT task goes into a loop,
>>the way this is usually done is to keep a session around with a higher
>>priority.  Using this concept, one might provide tools that, from
>>userland, ensure that such a session exists prior to launching the
>>"suspect" code.  I fail to see the need for this sort of code in the
>>kernel.

> That works, until your shell calls ext3_mark_inode_dirty(), which blocks on
> kjournald activity.  kjournald is SCHED_OTHER, and never runs...

That is classic priority inversion.  It would be "nice" to find a fix
for that :)  I think that the proposed action should not be triggered
until there is some "notice" that something is wrong.  I suppose it
could be a watchdog timer of some sort.  Still, if the priority
inversion issue were solved, all the rest could be done in userland.

--

High-res-timers:  http://sourceforge.net/projects/high-res-timers/
Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml


 
 
 

1. SCHED_FIFO task blocks magic sysrq

It seems like the sysrq code can get starved by a SCHED_FIFO task.  I
learned this by having an accidentally runaway SCHED_FIFO task which
locked up my system.  No SAK, no sync, no unmount, no reboot.  Big Red
Button.

David

--
David Mansfield                                           (718) 963-2020

Ultramaster Group, LLC                               www.ultramaster.com
