make nbd working in 2.5.x

Post by Petr Vandrovec » Tue, 04 Mar 2003 19:40:07



Hi Pavel,
   we use nbd for our diskless systems, and it looks to me like
it has some serious problems in 2.5.x... Can you apply this patch
and forward it to Linus?

There were:
* Missing disk's queue initialization
* The driver should use list_del_init: put_request now verifies
  that req->queuelist is empty, and list_del was incompatible
  with this.
* I converted nbd_end_request back to end_that_request_{first,last}
  as I saw no reason why the driver should do it itself... and
  blk_put_request has no place under queue_lock, so apparently when
  the semantics changed nobody went through the drivers...
                                Thanks,
                                        Petr Vandrovec

diff -urdN linux/drivers/block/nbd.c linux/drivers/block/nbd.c
--- linux/drivers/block/nbd.c   2003-02-28 20:56:05.000000000 +0100
+++ linux/drivers/block/nbd.c   2003-03-01 22:53:36.000000000 +0100

 {
        int uptodate = (req->errors == 0) ? 1 : 0;
        request_queue_t *q = req->q;
-       struct bio *bio;
-       unsigned nsect;
        unsigned long flags;

 #ifdef PARANOIA
        requests_out++;
 #endif
        spin_lock_irqsave(q->queue_lock, flags);
-       while((bio = req->bio) != NULL) {
-               nsect = bio_sectors(bio);
-               blk_finished_io(nsect);
-               req->bio = bio->bi_next;
-               bio->bi_next = NULL;
-               bio_endio(bio, nsect << 9, uptodate ? 0 : -EIO);
+       if (!end_that_request_first(req, uptodate, req->nr_sectors)) {
+               end_that_request_last(req);
        }
-       blk_put_request(req);
        spin_unlock_irqrestore(q->queue_lock, flags);
 }

                req = list_entry(tmp, struct request, queuelist);
                if (req != xreq)
                        continue;
-               list_del(&req->queuelist);
+               list_del_init(&req->queuelist);
                spin_unlock(&lo->queue_lock);
                return req;

                spin_lock(&lo->queue_lock);
                if (!list_empty(&lo->queue_head)) {
                        req = list_entry(lo->queue_head.next, struct request, queuelist);
-                       list_del(&req->queuelist);
+                       list_del_init(&req->queuelist);
                }
                spin_unlock(&lo->queue_lock);

                if (req->errors) {
                        printk(KERN_ERR "nbd: nbd_send_req failed\n");
                        spin_lock(&lo->queue_lock);
-                       list_del(&req->queuelist);
+                       list_del_init(&req->queuelist);
                        spin_unlock(&lo->queue_lock);
                        nbd_end_request(req);

                disk->first_minor = i;
                disk->fops = &nbd_fops;
                disk->private_data = &nbd_dev[i];
+               disk->queue = &nbd_queue;
                sprintf(disk->disk_name, "nbd%d", i);
                set_capacity(disk, 0x3ffffe);
                add_disk(disk);
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

make nbd working in 2.5.x

Post by Pavel Machek » Tue, 04 Mar 2003 20:50:04


Hi!

>    we use nbd for our diskless systems, and it looks to me like
> it has some serious problems in 2.5.x... Can you apply this patch
> and forward it to Linus?

> There were:
> * Missing disk's queue initialization
> * The driver should use list_del_init: put_request now verifies
>   that req->queuelist is empty, and list_del was incompatible
>   with this.
> * I converted nbd_end_request back to end_that_request_{first,last}
>   as I saw no reason why the driver should do it itself... and
>   blk_put_request has no place under queue_lock, so apparently when
>   the semantics changed nobody went through the drivers...

I do not think this is a good idea. I am not sure who converted it to
bio, but he surely had a good reason to do that.

> diff -urdN linux/drivers/block/nbd.c linux/drivers/block/nbd.c
> --- linux/drivers/block/nbd.c      2003-02-28 20:56:05.000000000 +0100
> +++ linux/drivers/block/nbd.c      2003-03-01 22:53:36.000000000 +0100

>  {
>    int uptodate = (req->errors == 0) ? 1 : 0;
>    request_queue_t *q = req->q;
> -  struct bio *bio;
> -  unsigned nsect;
>    unsigned long flags;

>  #ifdef PARANOIA
>    requests_out++;
>  #endif
>    spin_lock_irqsave(q->queue_lock, flags);
> -  while((bio = req->bio) != NULL) {
> -          nsect = bio_sectors(bio);
> -          blk_finished_io(nsect);
> -          req->bio = bio->bi_next;
> -          bio->bi_next = NULL;
> -          bio_endio(bio, nsect << 9, uptodate ? 0 : -EIO);
> +  if (!end_that_request_first(req, uptodate, req->nr_sectors)) {
> +          end_that_request_last(req);
>    }
> -  blk_put_request(req);
>    spin_unlock_irqrestore(q->queue_lock, flags);
>  }

--
Horseback riding is like software...
...vgf orggre jura vgf serr.

make nbd working in 2.5.x

Post by Petr Vandrovec » Tue, 04 Mar 2003 21:00:17



> >    we use nbd for our diskless systems, and it looks to me like
> > it has some serious problems in 2.5.x... Can you apply this patch
> > and forward it to Linus?

> > There were:
> > * Missing disk's queue initialization
> > * The driver should use list_del_init: put_request now verifies
> >   that req->queuelist is empty, and list_del was incompatible
> >   with this.
> > * I converted nbd_end_request back to end_that_request_{first,last}
> >   as I saw no reason why the driver should do it itself... and
> >   blk_put_request has no place under queue_lock, so apparently when
> >   the semantics changed nobody went through the drivers...

> I do not think this is a good idea. I am not sure who converted it to
> bio, but he surely had a good reason to do that.

I think that at the beginning of the 2.5.x series there was some thinking
about removing end_that_request* completely from the API. As that never
happened, and __end_that_request_first()/end_that_request_last() are
definitely of better quality (for example, they do not ignore req->waiting...)
than the open-coded nbd loop, I prefer using end_that_request* over
open-coding the bio traversal.

If you want, then just replace blk_put_request() with __blk_put_request()
instead of the first change. But I personally would not trust such code, as
the next time something in bio changes, nbd will miss the change again.
                                                    Petr Vandrovec


> > diff -urdN linux/drivers/block/nbd.c linux/drivers/block/nbd.c
> > --- linux/drivers/block/nbd.c 2003-02-28 20:56:05.000000000 +0100
> > +++ linux/drivers/block/nbd.c 2003-03-01 22:53:36.000000000 +0100

> >  {
> >   int uptodate = (req->errors == 0) ? 1 : 0;
> >   request_queue_t *q = req->q;
> > - struct bio *bio;
> > - unsigned nsect;
> >   unsigned long flags;

> >  #ifdef PARANOIA
> >   requests_out++;
> >  #endif
> >   spin_lock_irqsave(q->queue_lock, flags);
> > - while((bio = req->bio) != NULL) {
> > -     nsect = bio_sectors(bio);
> > -     blk_finished_io(nsect);
> > -     req->bio = bio->bi_next;
> > -     bio->bi_next = NULL;
> > -     bio_endio(bio, nsect << 9, uptodate ? 0 : -EIO);
> > + if (!end_that_request_first(req, uptodate, req->nr_sectors)) {
> > +     end_that_request_last(req);
> >   }
> > - blk_put_request(req);
> >   spin_unlock_irqrestore(q->queue_lock, flags);
> >  }


make nbd working in 2.5.x

Post by Jens Axboe » Thu, 06 Mar 2003 11:30:12




> > >    we use nbd for our diskless systems, and it looks to me like
> > > it has some serious problems in 2.5.x... Can you apply this patch
> > > and forward it to Linus?

> > > There were:
> > > * Missing disk's queue initialization
> > > * The driver should use list_del_init: put_request now verifies
> > >   that req->queuelist is empty, and list_del was incompatible
> > >   with this.
> > > * I converted nbd_end_request back to end_that_request_{first,last}
> > >   as I saw no reason why the driver should do it itself... and
> > >   blk_put_request has no place under queue_lock, so apparently when
> > >   the semantics changed nobody went through the drivers...

> > I do not think this is a good idea. I am not sure who converted it to
> > bio, but he surely had a good reason to do that.

> I think that at the beginning of the 2.5.x series there was some thinking
> about removing end_that_request* completely from the API. As that never
> happened, and __end_that_request_first()/end_that_request_last() are
> definitely of better quality (for example, they do not ignore req->waiting...)
> than the open-coded nbd loop, I prefer using end_that_request* over
> open-coding the bio traversal.

> If you want, then just replace blk_put_request() with __blk_put_request()
> instead of the first change. But I personally would not trust such code, as
> the next time something in bio changes, nbd will miss the change again.

I agree with the change, there's no reason for nbd to implement its own
end_request handling. I was the one to do the bio conversion, doing XXX
drivers at one time...

A small correction to your patch: you need not hold queue_lock when
calling end_that_request_first() (which is the costly part of ending a
request), so

        if (!end_that_request_first(req, uptodate, req->nr_sectors)) {
                unsigned long flags;

                spin_lock_irqsave(q->queue_lock, flags);
                end_that_request_last(req);
                spin_unlock_irqrestore(q->queue_lock, flags);
        }

would be enough. That depends on the driver having pulled the request
off the list in the first place, which nbd has.

Also, it looks like it would be much better to simply let the queue lock
for an nbd_device be inherited from nbd_device->lo_lock.

--
Jens Axboe


make nbd working in 2.5.x

Post by Petr Vandrovec » Thu, 06 Mar 2003 13:40:11





> > I think that at the beginning of the 2.5.x series there was some thinking
> > about removing end_that_request* completely from the API. As that never
> > happened, and __end_that_request_first()/end_that_request_last() are
> > definitely of better quality (for example, they do not ignore req->waiting...)
> > than the open-coded nbd loop, I prefer using end_that_request* over
> > open-coding the bio traversal.

> > If you want, then just replace blk_put_request() with __blk_put_request()
> > instead of the first change. But I personally would not trust such code, as
> > the next time something in bio changes, nbd will miss the change again.

> I agree with the change, there's no reason for nbd to implement its own
> end_request handling. I was the one to do the bio conversion, doing XXX
> drivers at one time...

> A small correction to your patch: you need not hold queue_lock when
> calling end_that_request_first() (which is the costly part of ending a
> request), so

>     if (!end_that_request_first(req, uptodate, req->nr_sectors)) {
>         unsigned long flags;

>         spin_lock_irqsave(q->queue_lock, flags);
>         end_that_request_last(req);
>         spin_unlock_irqrestore(q->queue_lock, flags);
>     }

> would be enough. That depends on the driver having pulled the request
> off the list in the first place, which nbd has.

But it also finishes the whole request at once, so probably with:

if (!end_that_request_first(...)) {
   ...
} else {
   BUG();
}

I had a patch for 2.5.3 which finished the request partially after each
chunk (usually 1500 bytes) received from the server, but it did not make
any difference in performance at that time (probably because of the way
the nbd server works and the speed of the network between server and
client). I'll try it again now...
                                                    Petr Vandrovec


make nbd working in 2.5.x

Post by Jens Axboe » Thu, 06 Mar 2003 13:50:07






> > > I think that at the beginning of the 2.5.x series there was some thinking
> > > about removing end_that_request* completely from the API. As that never
> > > happened, and __end_that_request_first()/end_that_request_last() are
> > > definitely of better quality (for example, they do not ignore req->waiting...)
> > > than the open-coded nbd loop, I prefer using end_that_request* over
> > > open-coding the bio traversal.

> > > If you want, then just replace blk_put_request() with __blk_put_request()
> > > instead of the first change. But I personally would not trust such code, as
> > > the next time something in bio changes, nbd will miss the change again.

> > I agree with the change, there's no reason for nbd to implement its own
> > end_request handling. I was the one to do the bio conversion, doing XXX
> > drivers at one time...

> > A small correction to your patch: you need not hold queue_lock when
> > calling end_that_request_first() (which is the costly part of ending a
> > request), so

> >     if (!end_that_request_first(req, uptodate, req->nr_sectors)) {
> >         unsigned long flags;

> >         spin_lock_irqsave(q->queue_lock, flags);
> >         end_that_request_last(req);
> >         spin_unlock_irqrestore(q->queue_lock, flags);
> >     }

> > would be enough. That depends on the driver having pulled the request
> > off the list in the first place, which nbd has.

> But it also finishes the whole request at once, so probably with:

> if (!end_that_request_first(...)) {
>    ...
> } else {
>   BUG();
> }

Sure.

> I had a patch for 2.5.3 which finished the request partially after each
> chunk (usually 1500 bytes) received from the server, but it did not make
> any difference in performance at that time (probably because of the way
> the nbd server works and the speed of the network between server and
> client). I'll try it again now...

Yes, that might still make sense, especially now since we actually pass
down partially completed chunks. But the bio end_io must support it, or
you will see no difference at all. And I don't think any of them do :).
Linus played with adding it to the multi-page fs helpers, but I think he
abandoned it. It should make larger read-aheads on slow media (floppy)
work a lot nicer, though.

--
Jens Axboe
