desc v0.61 found a 2.5 kernel bug

Post by Chuck Ebber » Thu, 01 May 2003 04:40:10




>>  And shouldn't CR3 be initialized in case anyone actually wants to
>> switch back to the kernel TSS?

> For now no, since the only task gate ever taken (double fault) never
> returns (you don't want to update the TSS's CR3 field on every
> switch_to(), so you would have to do it in the task gate return
> path, as well as having a correct LDT field).

  I want to write a TSS-based debug exception handler that just does
an iret when it gets invoked.  For now it looks easier to just keep
CR3 up-to-date on every switch.

> However, returning from a task gate is so fraught with races wrt
> segment registers that the best thing to do is to avoid it.

 Even with interrupts off?

------
 Chuck
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in

More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

 
 
 


Post by Gabriel Pauber » Thu, 01 May 2003 19:30:16




> >>  And shouldn't CR3 be initialized in case anyone actually wants to
> >> switch back to the kernel TSS?

> > For now no, since the only task gate ever taken (double fault) never
> > returns (you don't want to update the TSS's CR3 field on every
> > switch_to(), so you would have to do it in the task gate return
> > path, as well as having a correct LDT field).

>   I want to write a TSS-based debug exception handler that just does
> an iret when it gets invoked.  For now it looks easier to just keep
> CR3 up-to-date on every switch.

With 32-byte cache lines, cr3 seems to be in the same cache line as esp0,
so it's not that big a deal, but I'd still try to avoid this.
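
A rough way to check the cache-line claim is to lay out the 32-bit hardware TSS as the Intel manuals document it and compare field offsets. This is only a sketch: the names tss32 and same_cache_line are made up for illustration (this is not the kernel's tss_struct), and it assumes the TSS starts on a cache-line boundary.

```c
#include <stddef.h>
#include <stdint.h>

/* Hardware layout of the 32-bit TSS (104 bytes).  The 16-bit selector
 * fields are stored in 32-bit slots, so they are modelled as uint32_t.
 * Illustrative struct, not the kernel's own definition. */
struct tss32 {
    uint32_t back_link;              /* previous task link (low 16 bits) */
    uint32_t esp0, ss0;              /* ring 0 stack */
    uint32_t esp1, ss1;              /* ring 1 stack */
    uint32_t esp2, ss2;              /* ring 2 stack */
    uint32_t cr3;                    /* page directory base */
    uint32_t eip, eflags;
    uint32_t eax, ecx, edx, ebx;
    uint32_t esp, ebp, esi, edi;
    uint32_t es, cs, ss, ds, fs, gs; /* segment selectors */
    uint32_t ldt;                    /* LDT selector */
    uint16_t trap;                   /* debug trap flag (bit 0) */
    uint16_t io_bitmap_base;
};

/* Return 1 if two byte offsets fall in the same cache line, assuming the
 * structure itself begins on a cache-line boundary. */
static int same_cache_line(size_t off_a, size_t off_b, size_t line_size)
{
    return off_a / line_size == off_b / line_size;
}
```

With this layout esp0 sits at offset 4 and cr3 at offset 28, so both land in the first 32 bytes; eip at offset 32 would already be in the next line.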

> > However, returning from a task gate is so fraught with races wrt
> > segment registers that the best thing to do is to avoid it.

>  Even with interrupts off?

Yes. Consider the following:

        create an LDT entry
        load the segment to %fs
        clear the LDT entry (or mark it non present),
                -> %fs is now stale but still marked valid
        ...(no task switch)
        Interrupt handled through task gate
                -> stale selector written to TSS
        ...(interrupt handler)
        iret-> TS/NP/SF exception when loading segments in the
                new task (I believe it can't be GP)

Of course on an SMP machine with shared LDT, there are even more
ways of triggering segment related exceptions.
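
The sequence above can be walked through with a toy model. Nothing here touches real hardware: toy_ldt, toy_segreg and the helper functions are hypothetical names, and the model only captures the one property that matters, namely that a descriptor is checked when a segment register is loaded, while a task switch saves just the selector and re-checks it on the way back.

```c
#include <stdbool.h>
#include <stdint.h>

#define LDT_ENTRIES 4

/* Toy LDT: each entry is reduced to a "present" flag. */
struct toy_ldt { bool present[LDT_ENTRIES]; };

/* A segment register: the visible selector (an LDT index here) plus a
 * hidden descriptor copy that is refreshed only when the register is loaded. */
struct toy_segreg { uint16_t index; bool cached_valid; };

/* Load a segment register; the descriptor is checked at load time only. */
static bool load_segreg(struct toy_segreg *reg, const struct toy_ldt *ldt,
                        uint16_t index)
{
    if (index >= LDT_ENTRIES || !ldt->present[index])
        return false;                 /* real hardware would raise an exception */
    reg->index = index;
    reg->cached_valid = true;
    return true;
}

/* Task switch out: only the selector value is written to the TSS. */
static uint16_t save_to_tss(const struct toy_segreg *reg)
{
    return reg->index;
}

/* Task switch back (iret): the selector is reloaded and re-checked. */
static bool restore_from_tss(struct toy_segreg *reg, const struct toy_ldt *ldt,
                             uint16_t saved)
{
    reg->cached_valid = false;
    return load_segreg(reg, ldt, saved);
}

/* Replays the scenario; returns true if the final reload would fault. */
static bool stale_selector_faults_on_return(void)
{
    struct toy_ldt ldt = { .present = { false } };
    struct toy_segreg fs = { 0, false };

    ldt.present[1] = true;                 /* create an LDT entry            */
    if (!load_segreg(&fs, &ldt, 1))        /* load the segment to %fs        */
        return false;
    ldt.present[1] = false;                /* clear the LDT entry            */
    if (!fs.cached_valid)                  /* %fs is stale but still usable  */
        return false;
    uint16_t saved = save_to_tss(&fs);     /* interrupt through a task gate  */
    return !restore_from_tss(&fs, &ldt, saved); /* iret reloads and faults   */
}
```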

Currently %fs and %gs are lazily cleaned up when switching processes,
using the standard fixup mechanism; %ds and %es are cleaned up if
necessary when popping them off the stack in the return to user
mode path (the one which ends up in iret). There is no way to recover
from a bad user %cs/%ss; the process simply exits in the iret fixup.

But this works only because you can put a specific fixup for each
instruction which loads a given segment register (or two for iret).
In an iret from a task gate, you don't have this fine grained control
(all registers are loaded at once and then checked one by one)
and the return address is unpredictable, so the fixup mechanism is out.
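
For readers unfamiliar with the fixup mechanism, here is a minimal sketch of the idea: a table maps the address of each instruction that is allowed to fault to an address where execution resumes. The names (toy_extable_entry, toy_search_extable) and all addresses are invented for illustration, and the lookup is a plain linear search rather than whatever the kernel actually does.

```c
#include <stddef.h>
#include <stdint.h>

/* One entry per instruction that is allowed to fault. */
struct toy_extable_entry {
    uintptr_t insn;   /* address of the instruction that may fault */
    uintptr_t fixup;  /* where to resume if it does                */
};

/* Hypothetical table; the addresses are made up. */
static const struct toy_extable_entry toy_table[] = {
    { 0x1000, 0x9000 },  /* e.g. a "pop %ds" in the return path */
    { 0x1004, 0x9010 },  /* e.g. a "pop %es"                    */
    { 0x1010, 0x9020 },  /* e.g. the final iret                 */
};

/* Return the fixup address for a faulting instruction, or 0 if the
 * fault address has no entry (meaning the fault cannot be fixed up). */
static uintptr_t toy_search_extable(const struct toy_extable_entry *tbl,
                                    size_t n, uintptr_t fault_addr)
{
    for (size_t i = 0; i < n; i++)
        if (tbl[i].insn == fault_addr)
            return tbl[i].fixup;
    return 0;
}
```

A fault at a known address such as 0x1004 finds its fixup. After an iret through a task gate the fault is reported in the new task at whatever address it was interrupted at, so the lookup has nothing to match against.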

This does not mean that there is no way to safely return from an
interrupt handled through a task gate, but it's not simple (you
don't want to change the existing lazy cleanup mechanism which is
about as simple and low overhead as it gets for the common cases).

        Gabriel


Post by Chuck Ebber » Thu, 01 May 2003 22:20:08



>>   I want to write a TSS-based debug exception handler that just does
>> an iret when it gets invoked.  For now it looks easier to just keep
>> CR3 up-to-date on every switch.

> It seems cr3 is in the same cache line as esp0 for a 32 byte cache line,
> so it's not that big a deal, but I'd still try to avoid this.

 There's no easy way of fixing this up in the handler, so that's the plan
for now.  It also puts more info in the TSS dump right away.

> Currently %fs and %gs are lazily cleaned up when switching processes
> using the standard fixup mechanism, %ds and %es are cleaned up if
> necessary when popping them off the stack in the return to user
> mode path (the one which ends up in iret). There is no way to recover
> from bad user %cs/%ss, the process simply exits in the iret fixup.

 Looks like the only clean way is to follow the TSS back link and manually
validate the segment registers before returning:

  invalid FS,GS -> 0
     "    DS,ES -> __USER_DS
          CS,SS -> panic?
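
 The policy in that table could be sketched roughly as follows. This is not code from desc or from the kernel: toy_saved_segs, toy_sanitize_segs and the validity callback are hypothetical, TOY_USER_DS is a stand-in whose value is only illustrative (the real __USER_DS comes from the kernel headers), and the sample predicate is deliberately crude.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for __USER_DS; the value here is illustrative only. */
#define TOY_USER_DS 0x7b

/* Segment selectors as saved in the interrupted task's TSS. */
struct toy_saved_segs { uint16_t cs, ss, ds, es, fs, gs; };

/* Caller-supplied check that a selector can still be loaded. */
typedef bool (*toy_selector_ok_fn)(uint16_t selector);

/* Apply the policy: invalid FS/GS become the null selector, invalid
 * DS/ES become the user data segment.  Returns false if CS or SS is
 * invalid, in which case there is nothing sensible to return to. */
static bool toy_sanitize_segs(struct toy_saved_segs *s, toy_selector_ok_fn ok)
{
    if (!ok(s->fs)) s->fs = 0;
    if (!ok(s->gs)) s->gs = 0;
    if (!ok(s->ds)) s->ds = TOY_USER_DS;
    if (!ok(s->es)) s->es = TOY_USER_DS;
    return ok(s->cs) && ok(s->ss);
}

/* Demo predicate only: treat any selector with the TI bit (bit 2) set,
 * i.e. one that refers to the LDT, as invalid. */
static bool toy_gdt_only(uint16_t selector)
{
    return (selector & 0x4) == 0;
}
```

 A real check would have to look at the descriptor itself (present bit, type, DPL), not just at the selector bits.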

 Bad things can happen if a debug fault happens in certain places... for now
the solution is to only support int3 breakpoints and avoid those places.

 Given the above, I hope to be able to put int3 instructions in either
kernel or user code and get snapshots of CPU state in the kernel TSS.
------
 Chuck

Post by pauber » Sat, 10 May 2003 03:00:26


[Sorry for the delay; I've been extremely busy with other things]

>  Looks like the only clean way is to follow the TSS back link and manually
> validate the segment registers before returning:

>   invalid FS,GS -> 0
>      "    DS,ES -> __USER_DS
>           CS,SS -> panic?

It's still racy on SMP if a thread with the same MM is modifying the LDT
between the time you check whether the selectors are valid and the iret
instruction restoring the previous stack.

>  Bad things can happen if a debug fault happens in certain places... for now
> the solution is to only support int3 breakpoints and avoid those places.

Can you elaborate a bit? In which places?

>  Given the above, I hope to be able to put int3 instructions in either
> kernel or user code and get snapshots of CPU state in the kernel TSS.

And what about the little bit called TS in CR0, which is always set by
a task switch? That's one bit of state which will always be set when
the debug interrupt returns, and the current FPU code will be
confused by this AFAICT. Things become even more interesting if you
want to allow debug traps in the kernel routines using the FPU,
between kernel_fpu_begin() and kernel_fpu_end().
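
To make the TS point concrete, here is a tiny model of what a hardware task switch does to CR0. The names are hypothetical; the only hard fact used is that TS is bit 3 of CR0 and that a task switch sets it unconditionally.

```c
#include <stdint.h>

#define TOY_CR0_TS 0x8u   /* CR0 bit 3: Task Switched */

/* Model of the CR0 side effect of a hardware task switch: TS is set
 * no matter what it was before; all other bits are left alone. */
static uint32_t toy_cr0_after_task_switch(uint32_t cr0)
{
    return cr0 | TOY_CR0_TS;
}
```

Since a CR0 with TS clear and a CR0 with TS set both map to the same result, the handler cannot recover the earlier value from CR0 alone; it would need that information from somewhere else.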

        Gabriel


Post by Chuck Ebber » Sat, 10 May 2003 12:00:26



>>   invalid FS,GS -> 0
>>      "    DS,ES -> __USER_DS
>>           CS,SS -> panic?

> It's still racy on SMP if a thread with the same MM is modifying the LDT
> between the time you check whether the selectors are valid and the iret
> instruction restoring the previous stack.

 Probably nothing can be done about that, either.  Handling an invalid segment
with another hardware task doesn't help, since the trap occurs in the context
of the new task and there's no way to tell what happened by then.

>>  Bad things can happen if a debug fault happens in certain places... for now
>> the solution is to only support int3 breakpoints and avoid those places.

> Can you elaborate a bit, in which places?

 I never even implemented the above checks; there is just a comment in the code
where they belong. It ran for five days that way, then generated a string
of segfaults while trying to shut down.

>>  Given the above, I hope to be able to put int3 instructions in either
>> kernel or user code and get snapshots of CPU state in the kernel TSS.

> And what about the little bit called TS in CR0, which is always set by
> a task switch?

 Forgot all about that one.  Maybe pushing cs:eip and flags onto the kernel's
stack and returning to an iret in the kernel task would work?

1. desc v0.61 found a 2.5 kernel bug

 And this is the way to do it right, but...

 I finally realized the TS problem is basically unsolvable.  There is no
way to know what the value was before a switch happened.

 (BTW some other Free kernel has interesting things in its descriptor
tables: DPL 1 execute-only code segments, conforming code, expand-down
data, multiple LDTs etc...  It uncovered a bug in my code, too.)
