IDE from current bk tree, UDMA and two channels...

Post by Petr Vandrove » Wed, 31 Jul 2002 23:10:08



Hi Martin,
  here at work I have i845 chipset, with one UDMA100 disk connected
to the primary channel, and one UDMA100 disk and one CD-DVD on the
secondary one. CD-DVD driver is not loaded at all, all three devices
are configured for UDMA by kernel.

  Today 2.5.29-cset511 died when rebooting to 2.5.29-cset536 (rmap.c:212
BUG(), but I believe that it is fixed by Paulus's page->index patch
(cset520)) and after reboot I'm not able to fsck /dev/hdc1. It dies with

hdc: ide_dma_intr: status=0x58 [ drive ready,seek complete,data request]
hdc: request error, nr. 1

and fsck is in D state, and the channel is stopped :-( First, something easy: I think
that we should use ", " as a separator in dump_bits, and if there is
space after opening "[", there should be also space before closing "]".
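
A minimal sketch of the suggested output format, as a user-space illustration rather than the kernel's dump_bits: the helper name format_status and the table are hypothetical, and the bit names other than the three seen in the log above are generic labels. The bit values follow the standard ATA status register layout, where 0x58 is drive ready (0x40), seek complete (0x10) and data request (0x08).

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical name table for the ATA status register bits. */
struct status_bit {
    unsigned char mask;
    const char *name;
};

static const struct status_bit status_bits[] = {
    { 0x80, "busy" },
    { 0x40, "drive ready" },
    { 0x20, "write fault" },
    { 0x10, "seek complete" },
    { 0x08, "data request" },
    { 0x04, "corrected error" },
    { 0x02, "index" },
    { 0x01, "error" },
};

/*
 * Format the set bits of `status` into buf as "[ a, b, c ]":
 * ", " between names, one space after "[" and one before "]".
 * Output is truncated if buf is too small. Returns buf.
 */
char *format_status(unsigned char status, char *buf, size_t len)
{
    size_t used;
    size_t i;
    int first = 1;

    if (len == 0)
        return buf;
    used = (size_t)snprintf(buf, len, "[ ");
    for (i = 0; i < sizeof(status_bits) / sizeof(status_bits[0]) && used < len; i++) {
        if (!(status & status_bits[i].mask))
            continue;
        used += (size_t)snprintf(buf + used, len - used, "%s%s",
                                 first ? "" : ", ", status_bits[i].name);
        first = 0;
    }
    if (used < len)
        snprintf(buf + used, len - used, "%s", first ? "]" : " ]");
    return buf;
}
```

Called with status 0x58 and a 64-byte buffer, this yields the three names from the log line, separated by ", " and padded symmetrically inside the brackets.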

The second problem is that a read operation which ends with
"drive ready, seek complete, data request" (why did it happen in the first
place?) will just read one sector from the drive (it was a DMA transfer,
so drive->mult_count == 0), and then it returns from ata_error
with ATA_OP_CONTINUES. But what continues? The drive told us that the
current operation is done, and no new operation was started, so
there is very little chance that any IRQ will ever come, and the timer was
just removed by ata_irq_request(), so the channel will never wake up.

And third, why does this happen at all? When I instrumented ide_dma_intr
with printk, udma_stop() returned zero: it means that everything went
fine, the UDMA engine asked for an interrupt, no error, UDMA engine stopped.
The only reason I can think of is that the drive did not clear the DRQ bit
yet, or that we programmed the UDMA engine with too few bytes to transfer.
Either of these explanations looks strange to me, as neither explains
why it happens only when both channels are in use simultaneously.

And one last thing: the problem does not happen when only one of the channels
is active; it is triggered only when both channels are active, and
channel #1 is always the one which dies. Channel #0 uses IRQ14, channel #1
IRQ15, so there should be no sharing involved.
                                Thanks,
                                    Petr Vandrovec

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in

More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Post by Marcin Daleck » Wed, 31 Jul 2002 23:40:06



> Hi Martin,
>   here at work I have i845 chipset, with one UDMA100 disk connected
> to the primary channel, and one UDMA100 disk and one CD-DVD on the
> secondary one. CD-DVD driver is not loaded at all, all three devices
> are configured for UDMA by kernel.

>   Today 2.5.29-cset511 died when rebooting to 2.5.29-cset536 (rmap.c:212
> BUG(), but I believe that it is fixed by Paulus's page->index patch
> (cset520)) and after reboot I'm not able to fsck /dev/hdc1. It dies with

> hdc: ide_dma_intr: status=0x58 [ drive ready,seek complete,data request]
> hdc: request error, nr. 1

That usually indicates that some operation was started before
some other one had really finished.

> and fsck is D, and channel is stopped :-( First something easy: I think
> that we should use ", " as a separator in dump_bits, and if there is
> space after opening "[", there should be also space before closing "]".

Yep. No problem.

> Second problem is that read operation which ends with
> "drive ready, seek complete, data request" (why it happened in first
> place?) will just read one sector from drive (it was DMA transfer,
> so drive->mult_count == 0), and then it returns from ata_error
> with ATA_OP_CONTINUES. But what continues? Drive told us that
> current operation is done, and no new operation was started, so
> there is very low chance that some IRQ will ever come, and timer was
> just removed by ata_irq_request(), so channel will never awake.

What should continue is the retry of the operation, since otherwise
it will be abandoned in do_ide_request(). However, I will recheck.

> And third, why this happens at all? When I instrumented ide_dma_intr
> with printk, udma_stop() returns zero: it means that everything went
> fine, UDMA engine asked for interrupt, no error, UDMA engine stopped.
> Only reason I can invent is that drive did not clear DRQ bit yet, or
> that we programmed UDMA engine with too few bytes to transfer. Either
> of these explanations looks strange to me, as this does not explain
> why it happens only when both channels are in use simultaneously.

> And last thing: problem does not happen when only one of channels is
> active, it is triggered only when both channels are active, and
> channel #1 is always one which dies. Channel #0 uses IRQ14, channel #1
> IRQ15, so there should be no sharing involved.

Hmm, the order of channels matters for the way the queues are fed.
I think we could be experiencing reentrancy problems. Or there
are some errors in ata_irq_handler() in dispatching the incoming
IRQs. It would be a good idea to add an IRQ number parameter to the
IRQ handler type, since this would allow us to detect such situations.

One check that could help would be to discover the drive to service next
based on drive->queue in do_ide_request(), instead of naively looking
through all the drives there. At least comparing it to the
queue parameter after selection would make sense.
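
A minimal sketch of that lookup, under stated assumptions: the struct layouts, the `present` flag and the helper name drive_for_queue are hypothetical stand-ins for illustration, not the driver's real types. It only shows the idea of matching a drive to the queue that the request function was called for.

```c
#include <stddef.h>

#define MAX_DEVICES 2              /* assumption: master and slave per channel */

struct request_queue {             /* stand-in for the block layer queue */
    int id;
};

struct drive {                     /* hypothetical, heavily reduced drive */
    struct request_queue *queue;   /* queue owned by this drive */
    int present;                   /* non-zero if a device is attached */
};

struct channel {
    struct drive drives[MAX_DEVICES];
};

/*
 * Return the drive on `ch` whose queue is `q`, or NULL if none matches.
 * A NULL result means the request function was called with a queue that
 * does not belong to this channel, which the caller can treat as a bug.
 */
struct drive *drive_for_queue(struct channel *ch, const struct request_queue *q)
{
    int unit;

    for (unit = 0; unit < MAX_DEVICES; ++unit) {
        struct drive *tmp = &ch->drives[unit];

        if (!tmp->present)
            continue;
        if (tmp->queue == q)
            return tmp;
    }
    return NULL;
}
```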

Do you do unmasking of IRQs? Holding them a bit longer could have some
impact as well...

Thanks for the input, I will have to think through it a bit longer :-).


Post by Petr Vandrove » Thu, 01 Aug 2002 01:20:11



> > Second problem is that read operation which ends with
> > "drive ready, seek complete, data request" (why it happened in first
> > place?) will just read one sector from drive (it was DMA transfer,
> > so drive->mult_count == 0), and then it returns from ata_error
> > with ATA_OP_CONTINUES. But what continues? Drive told us that
> > current operation is done, and no new operation was started, so
> > there is very low chance that some IRQ will ever come, and timer was
> > just removed by ata_irq_request(), so channel will never awake.

> What should continue is the retry of the operation, since otherwise
> it will be abondoned in do_ide_request(). However I will recheck.

It looks to me like we only issue idle immediate and a reset
to the drive. And even if we reset the drive, we do not reissue the
command, not to mention resetting the handler. And because
ide_dma_intr -> ata_error reports ATA_OP_CONTINUES, ata_irq_request
will think that the handler reissued the command, and it will leave
IDE_BUSY set. So we are left with IDE_BUSY set, idle hardware, no handler
and no timer active, and with one in-flight request lost somewhere in the
system. Probably the code which reissued the command was dropped in past
changes?
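
The stuck state described above can be written down as a simple invariant check. This is only a model under stated assumptions: struct channel_state and channel_is_stuck are hypothetical names, and the three fields merely stand in for the IDE_BUSY flag, the pending interrupt handler and the timeout timer.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the per-channel bookkeeping discussed above. */
struct channel_state {
    bool busy;               /* models the IDE_BUSY flag */
    void (*handler)(void);   /* pending interrupt handler, NULL if none */
    bool timer_armed;        /* true while a timeout timer is pending */
};

/*
 * A channel marked busy must have something that can finish the request:
 * either an interrupt handler or a timer. If it has neither, nothing will
 * ever clear the busy flag and the request in flight is lost.
 */
bool channel_is_stuck(const struct channel_state *ch)
{
    return ch->busy && ch->handler == NULL && !ch->timer_armed;
}
```

A check like this could be used as a debugging assertion on every return path that claims the operation continues.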

Another problem I found: ata_error calls ata_status_poll, which can
call back into ata_error. Hardwiring the BUSY_STAT bit to 1 (by unplugging
the drive from the system, for example) can cause this loop, as far as I
can see. Fortunately, on my system the status register reads 0x7F after disk
unplug, but it still does not look correct.
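
One way to avoid such a loop is to bound the polling and report a timeout instead of re-entering the error path. The sketch below is an illustration only: wait_not_busy, the read_status_fn callback and the two stub readers are hypothetical, and only the BUSY_STAT value (bit 7 of the ATA status register) is taken from the standard register layout.

```c
#include <stdbool.h>
#include <stddef.h>

#define BUSY_STAT 0x80   /* ATA status register: device busy */

/* Hypothetical callback that reads the status register once. */
typedef unsigned char (*read_status_fn)(void *ctx);

/*
 * Poll the status register at most max_polls times. Returns true as soon
 * as BUSY is clear, false if it never clears. The last value read is
 * stored in *last (if non-NULL) so the caller can decide what to do,
 * without this function calling back into any error handler.
 */
bool wait_not_busy(read_status_fn read_status, void *ctx,
                   unsigned int max_polls, unsigned char *last)
{
    unsigned char st = BUSY_STAT;
    unsigned int i;

    for (i = 0; i < max_polls; i++) {
        st = read_status(ctx);
        if (!(st & BUSY_STAT))
            break;
    }
    if (last)
        *last = st;
    return !(st & BUSY_STAT);
}

/* Stub: a device whose BUSY bit is stuck at 1. */
unsigned char read_stuck_busy(void *ctx)
{
    (void)ctx;
    return 0xFF;
}

/* Stub: a floating bus that reads 0x7F, as reported above after unplug. */
unsigned char read_floating_bus(void *ctx)
{
    (void)ctx;
    return 0x7F;
}
```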

> > And last thing: problem does not happen when only one of channels is
> > active, it is triggered only when both channels are active, and
> > channel #1 is always one which dies. Channel #0 uses IRQ14, channel #1
> > IRQ15, so there should be no sharing involved.

> Do you do unmasking of IRQs? Holding them a bit longer could have some
> impact as well...

It was happening with default configuration, with unmaskirq=1. Now I tried

hdparm -u 0 /dev/hda; hdparm -u 0 /dev/hdc
vmware-config.pl -default & fsck -f /dev/hdc1

and it died again. vmware-config.pl is used as a simple compile test;
it happens with 'ls -lRta /' too, but with 'vmware-config.pl' it happens
much faster.

Stack trace when this problem happens is:

ide_dma_intr + b8/cc (here I added printstate() call)
ata_irq_request + 11e/1cc
handle_IRQ_event + 29/4c
do_IRQ + df/190
common_interrupt + 18/20
madvise_willneed + 10/94
radix_tree_lookup + 18/60
do_page_cache_readahead + 92/13c
do_generic_file_read + 57/2a8
generic_file_read + 11c/138
file_read_actor + 0/8c
vfs_read + b4/134
sys_read + 2a/3c
syscall_call + 7/b

It is a UP machine (with an SMP non-preemptible kernel). The stack trace
does not look like it was caused by some race.
                                                Best regards,
                                                    Petr Vandrovec


Post by Petr Vandrove » Thu, 01 Aug 2002 03:30:08




> > > Second problem is that read operation which ends with
> > > "drive ready, seek complete, data request" (why it happened in first
> > > place?) will just read one sector from drive (it was DMA transfer,
> > > so drive->mult_count == 0), and then it returns from ata_error
> > > with ATA_OP_CONTINUES. But what continues? Drive told us that
> > > current operation is done, and no new operation was started, so
> > > there is very low chance that some IRQ will ever come, and timer was
> > > just removed by ata_irq_request(), so channel will never awake.

> > What should continue is the retry of the operation, since otherwise
> > it will be abondoned in do_ide_request(). However I will recheck.

> It is UP machine (with SMP non-preemptible kernel). Stack trace does not
> look like that it was caused by some race.

There is something severely broken... I reenabled the
"ide: unexpected interrupt" message in ata_irq_request, and to my surprise
we get one spurious interrupt for each request we do, on both
channels, primary and secondary.

It looked like this:

udma_pci_init: sending read command to drive
ata_irq_request: IRQ arrived, for us, calling handler
ata_irq_request: handler returned 0
ide: unexpected interrupt 1 15 handler=00000000
callstack: ata_irq_request + 7e/234, handle_IRQ_event + 29/4c,
           do_IRQ + df/190, common_interrupt + 18/20, do_softirq + 50/ac,
           do_IRQ + 179/190, common_interrupt + 18/20
udma_pci_init: sending read command to drive
ata_irq_request: IRQ arrived, for us, calling handler
ata_irq_request: handler returned 0
ide: unexpected interrupt 1 15 handler=00000000
callstack: same as above
udma_pci_init: sending read command to drive
ata_irq_request: IRQ arrived, for us, calling handler
ata_irq_request: handler returned 0
udma_pci_init: sending read command to drive
ata_irq_request: command immediately queued by do_ide_request
ata_irq_request: IRQ arrived, for us, calling handler
oops: ide_dma_intr: udmastatus=00, diskstatus=58

So we are getting one spurious interrupt for each UDMA request.
As long as we do not issue a new command to the drive immediately, the IRQ
is silently ignored, and everybody is happy (?). But when we
queue a command immediately by calling do_ide_request in
ata_irq_request, sooner or later a spurious interrupt will
arrive with the wrong timing, and we'll think that the command is
done while it is still in progress.

I see the same spurious interrupt problem on the primary channel too,
but somehow the timing is different with UDMA100, and we always find the
command done instead of in progress when the spurious interrupt happens.

Unfortunately, ATA/ATAPIv7 says that a single interrupt is triggered
after the command is done and all data transferred, and we do not play
with the select bit. But we do play with the nIEN bit of the disk. Do you
see any reason why this should cause a spurious interrupt? (The system is
using XT-PIC, FYI.)
                                        Thanks,
                                            Petr Vandrovec


Post by Petr Vandrove » Thu, 01 Aug 2002 04:30:15



> Unfortunately ATA/ATAPIv7 says that single interrupt is triggered
> after command is done and all data transfered, and we do not play
> with select bit. But we play with nIEN bit of disk. Do you see
> any reason why this should cause spurious interrupt? (system is using
> XT-PIC, FYI)

OK. As I am using only one device on each channel, I commented
out ata_irq_enable(drive, 1) in ide-disk.c when issuing a command,
and removed the IRQ disabling in ide_do_request in ide.c when we
do not issue a command to the drive, and the spurious interrupts disappeared.
So now I'm getting only half of the IRQs for channel 0, and the system still
works as before ;-)

Unfortunately, the problem is still here: when the kernel was in
idedisk_do_request running on channel 0, an IRQ for channel 1 arrived, and
this IRQ found the channel 1 DMA engine ready, but the drive had DRQ set...
oops. Shortly after that, an IRQ for channel 1 arrived again, but as it was
unexpected, nothing happened.

I hope that the i845 is not a simplex device, but the first (unexpected) IRQ
arrived just when the channel 0 code wrote a new value to its IDE_SELECT_REG
register. Now I have even disconnected the DVD drive, so it is a simple two
masters, two channels configuration, but it still happens.

And as always, something else: ata_error does:

OUT_BYTE(WIN_NOP, ch->ports[IDE_CONTROL_OFFSET])

I'd say that it should use 0x00 instead of WIN_NOP, and also that the
comment above OUT_BYTE(0x04, ch->ports[IDE_CONTROL_OFFSET]) is bogus.
The command register is IDE_COMMAND, not IDE_CONTROL ;-)
                                        Best regards,
                                                Petr Vandrovec


Post by Marcin Daleck » Fri, 02 Aug 2002 18:50:09




>>Unfortunately ATA/ATAPIv7 says that single interrupt is triggered
>>after command is done and all data transfered, and we do not play
>>with select bit. But we play with nIEN bit of disk. Do you see
>>any reason why this should cause spurious interrupt? (system is using
>>XT-PIC, FYI)

> OK. As I am using only one device on each channel, I commented
> out ata_irq_enable(drive, 1) in ide-disk.c when issuing command,
> and removed disabling irq in ide_do_request in ide.c when we
> do not issue command to the drive, and spurious interrupts disappeared.
> So now I'm getting only half of IRQs for channel 0, and system still
> works as before ;-)

Well, OK, this was my next idea, but apparently you already did the
experiment on your own. Thanks for the result. I'm still scratching my
head, and I have already observed this myself before.
It's always funny to see what happens when one stops a driver
from deliberately disabling IRQs for eons of jiffies :-).

> Unfortunately, problem is still here: when kernel was in idedisk_do_request
> performed on channel 0, IRQ for channel 1 arrived, and this irq found
> channel 1 DMA engine ready, but drive had DRQ set... oops. Shortly after
> that IRQ for channel 1 arrived again, but as it was unexpected, nothing
> happened.

> I hope that i845 is not simplex device, but first (unexpected) IRQ arrived
> just when channel 0 code wrote new value to its IDE_SELECT_REG register.
> Now I even disconnected DVD drive, so it is simple two masters, two
> channels configuration, but it still happens.

One idea and one experiment I was already thinking about is
to change do_ide_request to actually *not* deliberately select which
device to handle. (The big for loop found there...)
One can instead search for a device on the channel which matches
the queue for which do_ide_request() was called.

for (unit = 0; unit < MAX_DEVICES; ++unit) {
   ....
   if (tmp->queue == q) {
         drive = tmp;
         break;
   }
}

if (!drive)
   BUG();

Just please forget temporarily that there is a mechanism for "sleeping".
It is bogus anyway (it doesn't give time back to anybody), and the only
consumers of it are ide-cd (easily removed there) and ide-tape.c (don't
care, the driver was never usable in 2.5.xx).

> And as always, something else: ata_error does:

> OUT_BYTE(WIN_NOP, ch->ports[IDE_CONTROL_OFFSET])

> I'd say that it should use 0x00 instead of WIN_NOP, and also that
> comment above OUT_BYTE(0x04, ch->ports[IDE_CONTROL_OFFSET]) is bogus.
> Command register is IDE_COMMAND, not IDE_CONTROL ;-)

Yes, I already know about this; I will remove the comment.
(Must have forgotten about it.)


Post by Marcin Daleck » Fri, 02 Aug 2002 18:50:08





>>>>Second problem is that read operation which ends with
>>>>"drive ready, seek complete, data request" (why it happened in first
>>>>place?) will just read one sector from drive (it was DMA transfer,
>>>>so drive->mult_count == 0), and then it returns from ata_error
>>>>with ATA_OP_CONTINUES. But what continues? Drive told us that
>>>>current operation is done, and no new operation was started, so
>>>>there is very low chance that some IRQ will ever come, and timer was
>>>>just removed by ata_irq_request(), so channel will never awake.

>>>What should continue is the retry of the operation, since otherwise
>>>it will be abondoned in do_ide_request(). However I will recheck.

>>It is UP machine (with SMP non-preemptible kernel). Stack trace does not
>>look like that it was caused by some race.

> There is something severely broken... I reenabled
> ide: unexpected interrupt in ata_irq_request and to my surprise here
> we get one suprious interrupt for each request we do, on both
> channels - primary and secondary.

> It looked:

> udma_pci_init: sending read command to drive
> ata_irq_request: IRQ arrived, for us, calling handler
> ata_irq_request: handler returned 0
> ide: unexpected interrupt 1 15 handler=00000000
> callstack: ata_irq_request + 7e/234, handle_IRQ_event + 29/4c,
>            do_IRQ + df/190, common_interrupt + 18/20, do_softirq + 50/ac,
>            do_IRQ + 179/190, common_interrupt + 18/20
> udma_pci_init: sending read command to drive
> ata_irq_request: IRQ arrived, for us, calling handler
> ata_irq_request: handler returned 0
> ide: unexpected interrupt 1 15 handler=00000000
> callstack: same as above
> udma_pci_init: sending read command to drive
> ata_irq_request: IRQ arrived, for us, calling handler
> ata_irq_request: handler returned 0
> udma_pci_init: sending read command to drive
> ata_irq_request: command immediately queued by do_ide_request
> ata_irq_request: IRQ arrived, for us, calling handler
> oops: ide_dma_intr: udmastatus=00, diskstatus=58

> So we are getting one spurious interrupt for each UDMA request.
> Until we do not issue new command to the drive immediately, IRQ
> is silently ignored, and everybody is happy (?). But when we
> queue command immediately by call to do_ide_request in
> ata_irq_request, sooner or later spurious interrupt will
> arrive with wrong timming, and we'll think that command is
> done while it is still in progress.

> I see same spurious interrupt problem on primary channel too,
> but somehow timming is different with UDMA100, and we always find
> command done instead of in progress when spurious interrupt happens.

> Unfortunately ATA/ATAPIv7 says that single interrupt is triggered
> after command is done and all data transfered, and we do not play
> with select bit. But we play with nIEN bit of disk. Do you see
> any reason why this should cause spurious interrupt? (system is using
> XT-PIC, FYI)

What I actually try to do is to keep the nIEN bit set at the
times when we are not doing any transfer to the disk in question,
precisely to prevent the disk from spewing IRQs at times
when it should not. And yes, this bit acts in an inverted manner.
But I'm sure you already know this.
You could of course try to make the ata_irq_enable()
function a no-op and see whether this changes anything.
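
For readers less familiar with the inverted sense of nIEN, here is a minimal sketch of how a Device Control register value is composed. The bit positions (nIEN = bit 1, SRST = bit 2) are from the standard ATA register layout; the helper name device_control_value and the macro names are hypothetical and do not correspond to the driver's code.

```c
/* Standard ATA Device Control register bits. */
#define ATA_CTL_NIEN 0x02   /* 1 = device interrupt output disabled */
#define ATA_CTL_SRST 0x04   /* 1 = software reset asserted */

/*
 * Build the byte to write to the Device Control register.
 * Note the inversion: enabling interrupts means leaving nIEN clear.
 */
unsigned char device_control_value(int irq_enabled, int soft_reset)
{
    unsigned char value = 0;

    if (!irq_enabled)
        value |= ATA_CTL_NIEN;
    if (soft_reset)
        value |= ATA_CTL_SRST;
    return value;
}
```

With interrupts enabled and no reset the value is 0x00, which is why a plain 0x00 (rather than a command opcode that happens to be zero) is the natural constant to write to this register.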

(Me: Scratching my head with a puzzled expression on the face...;-)


Post by Jens Axbo » Fri, 02 Aug 2002 19:00:11



> >Unfortunately, problem is still here: when kernel was in idedisk_do_request
> >performed on channel 0, IRQ for channel 1 arrived, and this irq found
> >channel 1 DMA engine ready, but drive had DRQ set... oops. Shortly after
> >that IRQ for channel 1 arrived again, but as it was unexpected, nothing
> >happened.

> >I hope that i845 is not simplex device, but first (unexpected) IRQ arrived
> >just when channel 0 code wrote new value to its IDE_SELECT_REG register.
> >Now I even disconnected DVD drive, so it is simple two masters, two
> >channels configuration, but it still happens.

> One idea and one experiment I was already thinking about is
> to change do_ide_request to actually *not* select delibreately which
> device do handle. (The big for loop found there...)
> One can instead search for a device on the channel which is matching
> the queue for which do_ide_request() was called.

> for (unit = 0; unit < MAX_DEVICES; ++unit) {
>   ....
>   if (tmp->queue == q) {
>         drive = tmp;
>    break;
>   }
> }
> if (!drive)
>   BUG();

hey that sucks :-)

seriously, the better way to do this would be to change the q->queuedata
to be a pointer to drive instead of the channel.

that would work, but I think it would seriously starve the other device
on the same channel.

--
Jens Axboe


Post by Marcin Daleck » Fri, 02 Aug 2002 19:10:05




>>>Unfortunately, problem is still here: when kernel was in idedisk_do_request
>>>performed on channel 0, IRQ for channel 1 arrived, and this irq found
>>>channel 1 DMA engine ready, but drive had DRQ set... oops. Shortly after
>>>that IRQ for channel 1 arrived again, but as it was unexpected, nothing
>>>happened.

>>>I hope that i845 is not simplex device, but first (unexpected) IRQ arrived
>>>just when channel 0 code wrote new value to its IDE_SELECT_REG register.
>>>Now I even disconnected DVD drive, so it is simple two masters, two
>>>channels configuration, but it still happens.

>>One idea and one experiment I was already thinking about is
>>to change do_ide_request to actually *not* select delibreately which
>>device do handle. (The big for loop found there...)
>>One can instead search for a device on the channel which is matching
>>the queue for which do_ide_request() was called.

>>for (unit = 0; unit < MAX_DEVICES; ++unit) {
>>  ....
>>  if (tmp->queue == q) {
>>        drive = tmp;
>>        break;
>>  }
>>}
>>if (!drive)
>>  BUG();

> hey that sucks :-)

Since IDE 111 not any more...

> seriously, the better way to do this would be to change the q->queuedata
> to be a pointer to drive instead of the channel.

... because this is already *done* there :-).

> that would work, but I think it would seriously starve the other device
> on the same channel.

We starve anyway, because the kernel isn't real-time and we can't
guarantee "sleeping" for some maximum time and coming back.
We don't reschedule the kernel during this kind of "sleeping".
And we can't know that a command on the "mate" will not take
an extraordinary amount of time. It's only a problem when mixing Travan
tapes with disks on a channel.


Post by Jens Axbo » Fri, 02 Aug 2002 19:10:06



> >hey that sucks :-)

> Since IDE 111 not any more...

Yeah I just saw that 110 was the 'broken' solution, 111 made it right.
Good.

> >seriously, the better way to do this would be to change the q->queuedata
> >to be a pointer to drive instead of the channel.

> ... becouse this is already *done* there :-).

:-)

> >that would work, but I think it would seriously starve the other device
> >on the same channel.

> We starve anyway, becouse the kernel isn't real time and we can't
> guarantee "sleeping" for some maximum time and comming back.
> We don't reschedule the kernel during this kind of "sleeping".
> And we can't know that a command on the "mate" will not take
> extraordinary amounts of time. It's only a problem if mixing travan
> tapes with disks on a channel.

I'm thinking about the alternation of the devices so one device can't
starve the other device off the channel.

--
Jens Axboe


Post by Marcin Daleck » Fri, 02 Aug 2002 19:40:07



>>>that would work, but I think it would seriously starve the other device
>>>on the same channel.

>>We starve anyway, becouse the kernel isn't real time and we can't
>>guarantee "sleeping" for some maximum time and comming back.
>>We don't reschedule the kernel during this kind of "sleeping".
>>And we can't know that a command on the "mate" will not take
>>extraordinary amounts of time. It's only a problem if mixing travan
>>tapes with disks on a channel.

> I'm thinking about the alternation of the devices so one device can't
> starve the other device off the channel.

Ah, so you are thinking about two equally powered devices
competing for the channel. Something I would call the "sumo fight"
situation. Well, disks didn't use the "sleeping" mechanism at all anyway,
and the chances someone would do cp from CD-ROM to CD-ROM are low.

Finally, I think that the proper granularity of scheduling requests to
the drive is, well, the request layer. The queue processing layer should
handle this, because otherwise we would have two "competing" optimization
mechanisms. And there we are indeed able to actually relinquish some CPU
time. If you look at a request processing optimization as a low-pass
signal filter, it's immediately obvious that the effects of chaining them
can be, well, at least "counterintuitive".


Post by Jens Axbo » Fri, 02 Aug 2002 19:50:07




> >>>that would work, but I think it would seriously starve the other device
> >>>on the same channel.

> >>We starve anyway, becouse the kernel isn't real time and we can't
> >>guarantee "sleeping" for some maximum time and comming back.
> >>We don't reschedule the kernel during this kind of "sleeping".
> >>And we can't know that a command on the "mate" will not take
> >>extraordinary amounts of time. It's only a problem if mixing travan
> >>tapes with disks on a channel.

> >I'm thinking about the alternation of the devices so one device can't
> >starve the other device off the channel.

> Ah so you are thinking about two equally powered devices
> competing for the channel. Something I would call the "sumo fight"
> situation. Well disks didn't use the "sleeping" mechanism at all anyway
> and the chances someone would do cp from CD-ROM to CD-ROM are low.

> Finally I think that the proper granularity of scheduling requests to
> the drive is, well, the request layer. The queue processing layer should
> handle this becouse otherwise we would have two "competing" optimization
> mechanisms. And there we are indeed able to actually relinquish some CPU
> time. If you look at an request processing optimization as a low pass
> signal filter it's immediately obvious that the effects of chaining them
> can be, well at least "counter intuitive".

Actually, I'm thinking of a much simpler scenario: basically any two
devices on the same channel, both with pending requests on the queue.
This could be a hard drive and a cd writer, for instance. If you have 60
requests pending for the hard drive, queue gets unplugged, you start the
first one. Correct me if I'm wrong, but now you pass back the drive to
the request handler when the first request completes, and you select a
new request from that very same drive without considering device
starvation? Any run of the cd writer queue would do nothing, since it
would just find the channel busy.

This sort of thing cannot be solved at the block layer. The two queues
are independent seen from that layer, the channel-busy dependency cannot
be solved there.
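
A rough sketch of the alternation idea, assuming a hypothetical per-channel structure that knows how many requests each device has pending: struct channel_sched and pick_next_device are illustrative names only, and the real driver would pull requests from the per-drive queues rather than decrement counters.

```c
#define MAX_DEVICES 2   /* assumption: master and slave on one channel */

/* Hypothetical per-channel scheduling state. */
struct channel_sched {
    unsigned int pending[MAX_DEVICES];  /* queued requests per device */
    int last_served;                    /* unit served most recently */
};

/*
 * Pick the device to serve next, round-robin: start with the device
 * after the one served last, so a long queue on one device cannot keep
 * the other device off the channel. Consumes one pending request and
 * returns the chosen unit, or -1 if nothing is pending.
 */
int pick_next_device(struct channel_sched *ch)
{
    int i;

    for (i = 1; i <= MAX_DEVICES; i++) {
        int unit = (ch->last_served + i) % MAX_DEVICES;

        if (ch->pending[unit] > 0) {
            ch->pending[unit]--;
            ch->last_served = unit;
            return unit;
        }
    }
    return -1;
}
```

With 60 requests pending on unit 0 and one on unit 1, the single request for unit 1 is served second instead of after all 60.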

--
Jens Axboe


Post by Petr Vandrove » Sat, 03 Aug 2002 02:20:06



> Well OK this was my next idea, but apparently you already did the
> experient on your own. Thanks for the result. I'm still scratching my
> head and I have already observed this before myself.
> It's always funny to see what happens when one stops a driver
> from deliberately disabling IRQs for eons of jiffies :-).

I finally managed to compile older kernels, and I found that
2.5.27 (and 2.4.19-rc1 and 2.5.26) works fine (modulo endless loop
in ide_do_request... but it takes at least 5 minutes to trigger it),
while 2.5.28 dies in one second with UDMA status 0x25 (irq requested,
transfer in progress) and IDE status 0x58 (drq asserted).
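
The two status values can be decoded with a small check. The sketch assumes the standard bus-master IDE status register layout (bit 0 = active, bit 1 = error, bit 2 = interrupt) and the standard ATA status bits; the function name dma_transfer_complete and the macro names are hypothetical, and this is not the driver's ide_dma_intr logic.

```c
#include <stdbool.h>

/* Bus-master IDE status register bits. */
#define BM_STAT_ACTIVE 0x01   /* DMA engine still transferring */
#define BM_STAT_ERROR  0x02   /* DMA error */
#define BM_STAT_INTR   0x04   /* interrupt raised for this channel */

/* ATA status register bits. */
#define ATA_STAT_BUSY  0x80
#define ATA_STAT_DRQ   0x08
#define ATA_STAT_ERR   0x01

/*
 * Treat a DMA command as complete only if the bus-master engine reports
 * an interrupt with no transfer active and no error, and the drive is
 * neither busy nor still requesting data.
 */
bool dma_transfer_complete(unsigned char bm_status, unsigned char drive_status)
{
    if (!(bm_status & BM_STAT_INTR))
        return false;
    if (bm_status & (BM_STAT_ACTIVE | BM_STAT_ERROR))
        return false;
    if (drive_status & (ATA_STAT_BUSY | ATA_STAT_DRQ | ATA_STAT_ERR))
        return false;
    return true;
}
```

For the values reported above, 0x25 has both the interrupt and the active bit set and 0x58 has DRQ set, so under this check the command would not count as finished.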

Because the only changes in the IDE subsystem between 2.5.27 and 2.5.28 are
renaming __save_flags => local_save_flags, fixing get_request for
ioctl commands (so 2.5.28 should be correct while 2.5.27 is not),
and moving some ioctls around, it looks like the problem is triggered
by something else.

I currently suspect the IRQ handling changes, but maybe someone has
a better idea? Also, I cannot reproduce the problem with a Seagate UDMA66
drive switched to UDMA33 mode, so it looks like the problem is
timing/firmware (Toshiba MK6409MAV) dependent.

And I did all these tests with a UP kernel, just to eliminate the cli/sti
changes.
                                            Petr Vandrovec


Post by Petr Vandrove » Sat, 03 Aug 2002 07:10:04




> > Well OK this was my next idea, but apparently you already did the
> > experient on your own. Thanks for the result. I'm still scratching my
> > head and I have already observed this before myself.
> > It's always funny to see what happens when one stops a driver
> > from deliberately disabling IRQs for eons of jiffies :-).

> I currently suspect the IRQ handling changes, but maybe someone has
> a better idea? Also, I cannot reproduce the problem with a Seagate UDMA66
> drive switched to UDMA33 mode, so it looks like the problem is
> timing/firmware (Toshiba MK6409MAV) dependent.

I'd like to apologize to Ingo; his changes were completely innocent.
The problem was triggered by Al's 'block device size cleanups' (currently
cset 1.403.160.5 on bkbits).

Before this change, my system was using a 4KB block size when reading
from /dev/hdc1, because the blk_size[][] entry (which is in 1kB units) for
this partition was a multiple of 4, and so i_size % 4096 was 0.  But after
Al's change the partition size is read from gendisk, and not from blk_size,
and the gendisk partition size is in 512-byte units: and, as you can
probably guess, now my partition has i_size % 4096 == 512, and so a
512-byte block size is chosen. And with a 512-byte block size my
harddisk refuses to cooperate.
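The arithmetic can be checked with a small userspace sketch. It assumes the block size is picked by a doubling loop of the shape visible in the patch context below (start at the hardware sector size, double while the size stays divisible, stop at the page size), that the old blk_size[][] value was the sector count shifted right by one, and a 4096-byte page. The names pick_block_size, size_from_blk_size, size_from_gendisk and SKETCH_PAGE_SIZE are hypothetical, not kernel identifiers.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size */

/*
 * Return the largest power-of-two block size, between hardsect_size and
 * the page size, that evenly divides size_bytes.  size_bytes is assumed
 * to be a multiple of hardsect_size.
 */
unsigned pick_block_size(uint64_t size_bytes, unsigned hardsect_size)
{
    unsigned bsize = hardsect_size;

    assert(bsize != 0 && (bsize & (bsize - 1)) == 0);
    while (bsize < SKETCH_PAGE_SIZE) {
        if (size_bytes & bsize)   /* doubling would no longer divide */
            break;
        bsize <<= 1;
    }
    return bsize;
}

/* Old path (assumed): size kept in 1kB units, an odd last sector is lost. */
uint64_t size_from_blk_size(uint64_t sectors)
{
    return (sectors >> 1) * 1024;
}

/* New path: size kept in 512-byte sectors, nothing is lost. */
uint64_t size_from_gendisk(uint64_t sectors)
{
    return sectors * 512;
}
```

For a made-up partition of 8001 sectors, size_from_blk_size() gives 4096000 bytes and pick_block_size() returns 4096, while size_from_gendisk() gives 4096512 bytes (i_size % 4096 == 512) and pick_block_size() returns 512.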

I was trying to find a reason in the code why a 512-byte block size
should not work, but I was not able to find any. Maybe the I/O gurus
here will know?

For now, I'm using the patch below. It fixes the problem for me; block
size = 1024 is sufficient in my configuration. If you have any insight
into what the problem could be, please tell me. The problem apparently is
not that i_size is not a multiple of 1024: without changing bsize the
problem still occurs, even if I shrink i_size down to a multiple of 4K.

After some more testing I found that my other drive (120GB WD) handles
bsize=512 quite happily, so it looks like just my Toshiba disk
does not like 512B back-to-back transfers?! Are there any plans to
read from block devices in 4KB blocks for all reads/writes except for
the last partial page?
                                        Thanks,
                                                Petr Vandrovec

--- linux-2.5.29-c548/fs/block_dev.c.orig       2002-07-31 12:48:23.000000000 +0200

                                break;
                        bsize <<= 1;
                }
+               if (bsize == 512) {
+                       printk(KERN_ERR "Found 512b device! Using larger block size...\n");
+                       bdev->bd_inode->i_size -= 512;
+                       bsize = 1024;
+               }
                bdev->bd_block_size = bsize;
                bdev->bd_inode->i_blkbits = blksize_bits(bsize);
                if (p->queue)

IDE from current bk tree, UDMA and two channels...

Post by Marcin Daleck » Sat, 03 Aug 2002 07:30:06


Petr Vandrovec wrote:



>>>Well OK this was my next idea, but apparently you already did the
>>>experiment on your own. Thanks for the result. I'm still scratching my
>>>head and I have already observed this before myself.
>>>It's always funny to see what happens when one stops a driver
>>>from deliberately disabling IRQs for eons of jiffies :-).

>>I currently suspect the IRQ handling changes, but maybe someone has
>>a better idea? Also, I cannot reproduce the problem with a Seagate UDMA66
>>drive switched to UDMA33 mode, so it looks like the problem is
>>timing/firmware (Toshiba MK6409MAV) dependent.

> I'd like to apologize to Ingo; his changes were completely innocent.
> The problem was triggered by Al's 'block device size cleanups' (currently
> cset 1.403.160.5 on bkbits).

> Before this change, my system was using a 4KB block size when reading
> from /dev/hdc1, because the blk_size[][] entry (which is in 1kB units) for
> this partition was a multiple of 4, and so i_size % 4096 was 0.  But after
> Al's change the partition size is read from gendisk, and not from blk_size,
> and the gendisk partition size is in 512-byte units: and, as you can
> probably guess, now my partition has i_size % 4096 == 512, and so a
> 512-byte block size is chosen. And with a 512-byte block size my
> harddisk refuses to cooperate.

> I was trying to find a reason in the code why a 512-byte block size
> should not work, but I was not able to find any. Maybe the I/O gurus
> here will know?

Petr. First - I wish to express my respect (for whatever it's
worth). Once again you are fscking sharp and to the point in your
problem analysis.

The little I know about the situation is that the SATA people are
having great problems with the mediocre physical sector size and
are pushing hard toward bigger sector sizes. This may explain a bit
why one should stay alert in this area.

Would you mind sending me the output of hdparm -i /dev/hdx and
hdparm -I /dev/hdx for documentation purposes? The host controller
chip could be the one to blame as well.

I fear the need for yet another blacklist.


1. Nightly regression runs against current bk tree

Sorry I didn't see this sooner, I'm unsubscribed for the moment until my
email provider can get exim/procmail talking nicely.

LTP has had a mailing list for a long time that is explicitly for the
purpose of posting results.  It's currently underutilized so I'd love to
see more results getting posted there again.  Please consider using that
one for posting results of all types (LTP and non-LTP).


Thanks,
Paul Larson


2. Network card help - Urgent

3. bad: schedule() with irqs disabled! - current bk tree

4. QDS ? loadbalance ?

5. fix UP links - current bk tree

6. how to setup environment variables

7. PANIC caused by dequeue_signal() in current Linus BK tree

8. Dos window on X desktop?????

9. pfn-Functionset out of order for sparc64 in current Bk tree?

10. fix current BK tree compilation with devfs enabled

11. BUG: Current 2.5-BK tree dies on boot!

12. Nightly regression runs against current bk tree

13. BK Tree rev 1.910 ide-scsi.c compile fails