epoll (was Re: [PATCH] async poll for 2.5)

Post by Davide Libenzi » Fri, 18 Oct 2002 01:50:06





> >I told you that you did not understand the API; this code won't work for
> >edge-triggered APIs.

> Nonsense.  If you wish to make such a claim, you need to provide an
> example of a situation in which it won't work.

You're welcome. This is your code:

for (;;) {
     fd = event_wait(...);
     while (do_io(fd) != EAGAIN);
}

If the I/O space is not exhausted when you call event_wait(...), you'll
never receive the event, because you'll be waiting for a 0->1 transition
without ever having brought the signal to 0 (I/O space exhausted). That loop
is a typical use of poll(), select() and /dev/poll, and it shows pretty
clearly that you do not seem to understand edge-triggered event APIs. If you
code your I/O function like:

int my_io(...) {

        if (event_wait(...))
                do_io(...);

}

and you consume only part of the I/O space with the first call to my_io(),
the second call will block _infinitely_.
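For readers who want to see this stall on a current system, here is a minimal sketch in C. It assumes the modern epoll(7) system calls (epoll_create1, epoll_ctl and epoll_wait with EPOLLET) rather than the /dev/epoll interface discussed in this thread, and it uses a pipe as a stand-in for a socket. The function name events_after_partial_read is made up for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Returns the number of events reported by a second, non-blocking
 * epoll_wait() after only part of the pending data has been read.
 * Returns -1 on setup failure. */
int events_after_partial_read(void)
{
    int p[2];
    struct epoll_event ev = { .events = EPOLLIN | EPOLLET };
    struct epoll_event out;
    char byte;
    int epfd, n;

    if (pipe2(p, O_NONBLOCK) < 0)
        return -1;
    epfd = epoll_create1(0);
    if (epfd < 0)
        return -1;
    ev.data.fd = p[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev) < 0)
        return -1;

    /* 0->1 transition: data arrives on an empty pipe. */
    if (write(p[1], "abcd", 4) != 4)
        return -1;
    if (epoll_wait(epfd, &out, 1, 0) != 1)   /* the edge is reported once */
        return -1;

    /* Consume only part of the I/O space; the "signal" stays at 1. */
    if (read(p[0], &byte, 1) != 1)
        return -1;

    /* No new edge has occurred. A timeout of 0 is used so that the demo
     * returns instead of blocking the way my_io() would. */
    n = epoll_wait(epfd, &out, 1, 0);

    close(epfd);
    close(p[0]);
    close(p[1]);
    return n;
}
```

Calling events_after_partial_read() from main() should report that the second wait sees no event even though three bytes are still unread, which is the situation in which a blocking event_wait() would never return.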

- Davide

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in

More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

 
 
 

Post by Davide Libenzi » Fri, 18 Oct 2002 02:20:08




> >Not to enter into any of the other discussions on this issue, I wouldn't
> >usually do what you suggest above. [...] if I did
> >a recv() or read() of 2K, and I only received 1K, there is no reason why
> >another system call should be invoked on the resource that likely will not
> >have any data ready.

> You're into the minutiae here.  Sure, you can optimize the read() in
> some cases, but Mr. Libenzi's example of a correct code scheme is no
> better than mine when it comes to this.

The poll()-like code :

int my_io(...) {

        if (poll(...))
                do_io(...);

}

The epoll-like code :

int my_io(...) {

        while (do_io(...) == EAGAIN)
                event_wait(...);

}

I would say that the epoll-like code generates fewer system calls, because
if you call my_io() to process small chunks of the I/O space, the
epoll-like code generates only one system call per chunk while the
poll()-like code generates two. When the I/O ends up waiting, the
poll()-like code generates two system calls while the epoll-like code
generates three. Overall the number of system calls is about the same, and
from a performance point of view /dev/epoll looks "pretty good" (see the
/dev/epoll page).
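The epoll-like scheme can be sketched with today's epoll(7) calls, which are assumed here in place of the /dev/epoll interface of the thread. The counter wait_calls and the helper demo_small_chunks are hypothetical names added only to make the system-call count visible; a pipe stands in for a socket.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/epoll.h>
#include <unistd.h>

static int wait_calls;  /* counts epoll_wait() calls, for illustration only */

/* Epoll-like my_io(): attempt the read first and call epoll_wait()
 * only when the kernel answers EAGAIN. fd must be non-blocking and
 * already registered on epfd with EPOLLIN | EPOLLET. */
ssize_t my_io(int epfd, int fd, void *buf, size_t len)
{
    struct epoll_event ev;

    for (;;) {
        ssize_t n = read(fd, buf, len);
        if (n >= 0)
            return n;
        if (errno == EINTR)
            continue;
        if (errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;
        wait_calls++;
        if (epoll_wait(epfd, &ev, 1, -1) < 0 && errno != EINTR)
            return -1;
    }
}

/* Reads a 6-byte message in 2-byte chunks; returns the number of
 * epoll_wait() calls that were needed, or -1 on failure. */
int demo_small_chunks(void)
{
    int p[2];
    char buf[2];
    struct epoll_event ev = { .events = EPOLLIN | EPOLLET };
    int epfd = epoll_create1(0);

    if (epfd < 0 || pipe2(p, O_NONBLOCK) < 0)
        return -1;
    ev.data.fd = p[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev) < 0)
        return -1;
    if (write(p[1], "abcdef", 6) != 6)
        return -1;

    wait_calls = 0;
    for (int i = 0; i < 3; i++)
        if (my_io(epfd, p[0], buf, sizeof buf) != 2)
            return -1;

    close(epfd);
    close(p[0]);
    close(p[1]);
    return wait_calls;
}
```

In this run each small chunk costs one read() and no wait at all, which matches the "one system call versus two" case above. The wait is paid only once the data runs out.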

- Davide


Post by Davide Libenzi » Fri, 18 Oct 2002 20:30:12




> >>Nonsense.  If you wish to make such a claim, you need to provide an
> >>example of a situation in which it won't work.

> >You're welcome. This is your code:

> >for (;;) {
> >     fd = event_wait(...);
> >     while (do_io(fd) != EAGAIN);
> >}

> >If the I/O space is not exhausted when you call event_wait(...); you'll
> >never receive the event because you'll be waiting for a 0->1 transition
> >without bringing the signal to 0 ( I/O space exhausted ).

> My code above does exhaust the I/O space.

Look, I'm usually very polite but you're really wasting my time. You
should know that an instruction at line N is usually executed before an
instruction at line N+1. Now this IS your code :

[N-1] for (;;) {
[N  ]     fd = event_wait(...);
[N+1]     while (do_io(fd) != EAGAIN);
[N+2] }

I will leave it to you as an exercise to understand what happens when you
call the first event_wait(...); and there is still data to be read/written
on the file descriptor. The fact that you're asking /dev/epoll to drop an
event at fd insertion time shows very clearly that you're going to use the
API in the WRONG way and that you do not understand how such APIs work. And
the fact that there are users currently using the rt-sig and epoll APIs
means that either those guys are geniuses or you're missing something.

- Davide


Post by John Gardiner Myers » Sat, 19 Oct 2002 21:10:12



>Look, I'm usually very polite but you're really wasting my time. You
>should know that an instruction at line N is usually executed before an
>instruction at line N+1. Now this IS your code :

>[N-1] for (;;) {
>[N  ]     fd = event_wait(...);
>[N+1]     while (do_io(fd) != EAGAIN);
>[N+2] }

>I will leave you as an exercise to understand what happens when you call
>the first event_wait(...); and there is still data to be read/write on the
>file descriptor.

Your claim was that even if the API will drop an event at registration
time, my code scheme would not work.  Thus, we can take "the API will
drop an event at registration time" as postulated.  That being
postulated, if there is still data to be read/written on the file
descriptor then the first event_wait will return immediately.

In fact, given that postulate and the appropriate axioms about the
behavior of event_wait() and do_io(), one can prove that my code scheme
is equivalent to yours.  The logical conclusion from that and your claim
would be that you don't understand how edge triggered APIs have to be used.

>The reason you're asking /dev/epoll to drop an event at
>fd insertion time shows very clearly that you're going to use the API in
>the WRONG way and that you do not understand how such APIs work.

The wrong way as defined by what?  Having /dev/epoll drop appropriate
events at registration time permits a useful simplification/optimization
and makes the system significantly less prone to subtle programming errors.

I do understand how such APIs work, to the extent that I am pointing out
a flaw in their current models.

>And the fact that there are users currently using the rt-sig and epoll APIs means
>that either those guys are geniuses or you're missing something.

Nonsense.  People are able to use flawed APIs all of the time.


Post by Davide Libenzi » Sat, 19 Oct 2002 21:50:08



> Your claim was that even if the API will drop an event at registration
> time, my code scheme would not work.  Thus, we can take "the API will
> drop an event at registration time" as postulated.  That being
> postulated, if there is still data to be read/written on the file
> descriptor then the first event_wait will return immediately.

> In fact, given that postulate and the appropriate axioms about the
> behavior of event_wait() and do_io(), one can prove that my code scheme
> is equivalent to yours.  The logical conclusion from that and your claim
> would be that you don't understand how edge triggered APIs have to be used.

No, the concept of edge-triggered APIs is that you have to use the fd
until EAGAIN. It's a very simple concept. That means that after a
connect()/accept() you have to start using the fd, because I/O space might
be available for read()/write(). Dropping an event is an attempt to use
the API like poll() & Co., where after an fd is born it is put inside the
set to be woken up later. You're basically saying "the kernel should drop an
event at creation time" and I'm saying that, to keep the API usage
consistent with "use the fd until EAGAIN", you have to use the fd as soon as
it becomes available.

> >The reason you're asking /dev/epoll to drop an event at
> >fd insertion time shows very clearly that you're going to use the API in
> >the WRONG way and that you do not understand how such APIs work.

> The wrong way as defined by what?  Having /dev/epoll drop appropriate
> events at registration time permits a useful simplification/optimization
> and makes the system significantly less prone to subtle programming errors.

> I do understand how such APIs work, to the extent that I am pointing out
> a flaw in their current models.

I'm sorry, but why do you want to pass off your mistakes as API flaws?

- Davide


Post by Charles 'Buck' Krasic » Sat, 19 Oct 2002 23:10:08


> >[N-1] for (;;) {
> >[N  ]     fd = event_wait(...);
> >[N+1]     while (do_io(fd) != EAGAIN);
> >[N+2] }

I'm getting confused over what minute details are being disputed here.

This debate might get clearer, to me anyway, if the example code
fragments were more concrete.

So if anybody still cares at this point, here is my stab at clarifying
some things.

PART I:  THE RACE

Suppose we have the following:

1 for(;;) {
2      fd = event_wait(...);
3      if(fd == my_listen_fd) {
4           /* new connections */
5           while((new_fd = my_accept(my_listen_fd, ...)) != EAGAIN)
6                   epoll_addf(new_fd, ...);
7       } else {
8           /* established connections */
9           while(do_io(fd) != EAGAIN);
10      }
11 }

With the current epoll/rtsig semantics, there is a race condition
above.  I think this is essentially the same race condition as in the
snippet at the top of this message.

Just to be clear, I walk completely through the steps in the race
scenario, as follows.

We start with our application blocked in line 2.  

A new connection is initiated by the application on other side.

The kernels exchange SYNs, causing the connection to be established.

The kernel on our side queues the new connection, waiting for the
application on this side to call accept().  In the process it fires an
edge POLLIN on the listen_fd, which wakes up the kernel side of line
2.  However, some time may pass before we actually wake up.

Meanwhile, the other side immediately sends some application level
data. The other side is going to wait for us to read the application
level data and respond.  So it is now blocked.

All of this happens before our application runs line 5 to pick up the
new connection from the kernel.  

Here comes the race:

Before we reach line 6, new_fd is not in epoll mode, so packet
arrivals do not trigger a POLLIN edge notification on new_fd.

After line 6, there will be no data from the other side, so there will
still be no POLLIN edge notification for new_fd.

Therefore, line 2 will never yield a POLLIN event for new_fd, and the
new connection is now deadlocked.

Is this the kind of race we're talking about?
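The walkthrough above can be condensed into a toy model. The sketch below is not kernel code and calls no real API: struct toy_fd and the toy_* functions are hypothetical names that only mimic the semantics under discussion, with a flag selecting between "no event at registration" and "report initial readiness at registration".

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one descriptor under an edge-triggered notifier.
 * Everything here is hypothetical and only mimics the semantics
 * discussed in the thread. */
struct toy_fd {
    int  bytes;       /* unread bytes queued by the "kernel" */
    bool registered;  /* has epoll_addf() been called yet?   */
    bool event;       /* is a POLLIN edge pending?           */
};

/* A packet arrives: an edge fires only on a 0->1 transition of a
 * descriptor that is already registered. */
void toy_data_arrives(struct toy_fd *f, int n)
{
    if (f->registered && f->bytes == 0 && n > 0)
        f->event = true;
    f->bytes += n;
}

/* Registration. With report_initial == false (the semantics in the
 * walkthrough) existing data produces no event; with true (the
 * proposed change) readiness at registration posts one event. */
void toy_addf(struct toy_fd *f, bool report_initial)
{
    f->registered = true;
    if (report_initial && f->bytes > 0)
        f->event = true;
}

/* Returns true if event_wait() would ever return this descriptor when
 * the peer sends its request before registration and then goes quiet. */
bool toy_wakes_up(bool report_initial)
{
    struct toy_fd f = { 0, false, false };

    toy_data_arrives(&f, 100);     /* peer sends before line 6 runs */
    toy_addf(&f, report_initial);  /* line 6 */
    return f.event;                /* nothing else will ever arrive */
}
```

Under the first setting the descriptor holds 100 unread bytes and no pending event, which is the deadlock described above. Under the second setting one event is pending right after registration.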

If so, we proceed as follows.

PART 2: SOLUTIONS

A race free alternative to write the code above is as follows.  Only
one new line (marked with *) is added.

1 for(;;) {
2      fd = event_wait(...);
3      if(fd == my_listen_fd) {
4           /* new connections */
5           while((new_fd = my_accept(my_listen_fd, ...)) != EAGAIN) {
6                    epoll_addf(new_fd, ...);
7*                   while(do_io(new_fd) != EAGAIN);
8           }
9       } else {
10           /* established connections */
11           while(do_io(fd) != EAGAIN);
12      }
13 }

The example above works with current epoll and rtsig semantics.  This
is just rephrasing what Davide has been saying: "Never call event_wait
without first ensuring that IO space is definitively exhausted".

Or we could have (to make John happier?):

1 for(;;) {
2      fd = event_wait(...);
3      if(fd == my_listen_fd) {
4           /* new connections */
5           while((new_fd = my_accept(my_listen_fd, ...)) != EAGAIN) {
6*                  epoll_addf(new_fd, &pfd, ...);
7*                  if(pfd.revents & POLLIN) {
7*                      while(do_io(new_fd) != EAGAIN);
8*                  }
8           }
9       } else {
10           /* established connections */
11           while(do_io(fd) != EAGAIN);
12      }
13 }

Here, the epoll_addf primitive has been modified to return the initial
status, presumably so that we avoid the first call to do_io if there is
nothing to do yet.

If it's easy to do (change the add primitive, that is), why not?

The first solution works either way.
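As a rough illustration of the first solution, the step "register, then drain until EAGAIN" might look like this with the modern epoll(7) calls, which are an assumption here since the thread predates them. The names add_and_drain and demo_add_and_drain are hypothetical, and a pipe that already holds data stands in for a connection whose peer sent its request before registration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Register a freshly accepted, non-blocking descriptor and then
 * immediately drain it until EAGAIN, as in the line marked with *.
 * Returns the number of bytes consumed, or -1 on error. */
long add_and_drain(int epfd, int new_fd)
{
    struct epoll_event ev = { .events = EPOLLIN | EPOLLET };
    char buf[64];
    long total = 0;

    ev.data.fd = new_fd;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, new_fd, &ev) < 0)
        return -1;

    for (;;) {
        ssize_t n = read(new_fd, buf, sizeof buf);
        if (n > 0) {
            total += n;          /* a real server would process buf here */
            continue;
        }
        if (n == 0)
            break;               /* peer closed the connection */
        if (errno == EINTR)
            continue;
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            break;               /* I/O space exhausted: safe to wait */
        return -1;
    }
    return total;
}

/* Stand-in for a connection whose peer sent data before registration:
 * a pipe that already holds 100 bytes. Returns bytes drained. */
long demo_add_and_drain(void)
{
    int p[2];
    char msg[100] = { 0 };
    int epfd = epoll_create1(0);
    long got;

    if (epfd < 0 || pipe2(p, O_NONBLOCK) < 0)
        return -1;
    if (write(p[1], msg, sizeof msg) != (ssize_t)sizeof msg)
        return -1;
    got = add_and_drain(epfd, p[0]);
    close(epfd);
    close(p[0]);
    close(p[1]);
    return got;
}
```

Because the drain happens right after registration, the data that arrived early is consumed whether or not the registration itself generates an event.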

-- Buck


Post by Davide Libenzi » Sat, 19 Oct 2002 23:30:12



> I'm getting confused over what minute details are being disputed here.

> This debate might get clearer, to me anyway, if the example code
> fragments were more concrete.

> So if anybody still cares at this point, here is my stab at clarifying
> some things.

> PART I:  THE RACE

> Suppose we have the following:

> 1 for(;;) {
> 2      fd = event_wait(...);
> 3      if(fd == my_listen_fd) {
> 4           /* new connections */
> 5           while((new_fd = my_accept(my_listen_fd, ...)) != EAGAIN)
> 6                   epoll_addf(new_fd, ...);
> 7       } else {
> 8           /* established connections */
> 9           while(do_io(fd) != EAGAIN);
> 10      }
> 11 }

> With the current epoll/rtsig semantics, there is a race condition
> above.  I think this is essentially the same race condition as in the
> snippet at the top of this message.

> Just to be clear, I walk completely through the steps in the race
> scenario, as follows.

> We start with our application blocked in line 2.

> A new connection is initiated by the application on other side.

> The kernels exchange SYNs, causing the connection to be established.

> The kernel on our side queues the new connection, waiting for the
> application on this side to call accept().  In the process it fires an
> edge POLLIN on the listen_fd, which wakes up the kernel side of line
> 2.  However, some time may pass before we actually wake up.

> Meanwhile, the other side immediately sends some application level
> data. The other side is going to wait for us to read the application
> level data and respond.  So it is now blocked.

> All of this happens before our application runs line 5 to pick up the
> new connection from the kernel.

> Here comes the race:

> Before we reach line 6, new_fd is not in epoll mode, so packet
> arrivals do not trigger a POLLIN edge notification on new_fd.

> After line 6, there will be no data from the other side, so there will
> still be no POLLIN edge notification for new_fd.

> Therefore, line 2 will never yield a POLLIN event for new_fd, and the
> new connection is now deadlocked.

> Is this the kind of race we're talking about?

Exactly, you're going to wait for an event w/out having consumed the
possibly available I/O space.

> If so, we proceed as follows.

> PART 2: SOLUTIONS

> A race free alternative to write the code above is as follows.  Only
> one new line (marked with *) is added.

> 1 for(;;) {
> 2      fd = event_wait(...);
> 3      if(fd == my_listen_fd) {
> 4           /* new connections */
> 5           while((new_fd = my_accept(my_listen_fd, ...)) != EAGAIN) {
> 6                    epoll_addf(new_fd, ...);
> 7*                   while(do_io(new_fd) != EAGAIN);
> 8           }
> 9       } else {
> 10           /* established connections */
> 11           while(do_io(fd) != EAGAIN);
> 12      }
> 13 }

Exactly, this is a sketch of the solution (although event_wait() returns
more than one fd).

- Davide


Post by John Myers » Sun, 20 Oct 2002 03:00:10



>No, the concept of edge triggered APIs is that you have to use the fd
>until EAGAIN.

Which my code does, given the postulate.

>It's a very simple concept. That means that after a
>connect()/accept() you have to start using the fd, because I/O space might
>be available for read()/write(). Dropping an event is an attempt to use
>the API like poll() & Co., where after an fd is born it is put inside the
>set to be woken up later. You're basically saying "the kernel should drop an
>event at creation time" and I'm saying that, to keep the API usage
>consistent with "use the fd until EAGAIN", you have to use the fd as soon as
>it becomes available.

Here's where your argument is inconsistent with the Linux philosophy.

Linux has a strong philosophy of practicality.  The goal of Linux is to
do useful things, including providing applications with the semantics they
need to do useful things.  The criteria for deciding what goes into
Linux are heavily weighted towards what works best in practice.

Whether or not some API matches someone's Platonic ideal of an OS
interface is not a criterion.  In Linux, APIs are judged by their
practical merits.  This is why Linux does not have such things as
message passing and separate address spaces for drivers.

So whether or not a proposed set of epoll semantics is consistent with
your Platonic ideal of "use the fd until EAGAIN" is simply not an issue.
 What matters is what works best in practice.


Post by John Myers » Sun, 20 Oct 2002 03:10:06



>Or we could have (to make John happier?):

>1 for(;;) {
>2      fd = event_wait(...);
>3      if(fd == my_listen_fd) {
>4           /* new connections */
>5           while((new_fd = my_accept(my_listen_fd, ...)) != EAGAIN) {
>6*                  epoll_addf(new_fd, &pfd, ...);
>7*                  if(pfd.revents & POLLIN) {
>7*                      while(do_io(new_fd) != EAGAIN);
>8*                  }
>8           }
>9       } else {
>10           /* established connections */
>11           while(do_io(fd) != EAGAIN);
>12      }
>13 }

Close.  What we would have is a modification of the epoll_addf()
semantics such that it would have an additional postcondition that if
the new_fd is in the ready state (has data available) then at least one
notification has been generated.  In the code above, the three lines
comprising the if statement labeled "7*" would be removed.
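The postcondition described here can be checked against the epoll(7) system calls that ship in current Linux kernels, which as far as I know do report a descriptor that is already readable at EPOLL_CTL_ADD time. The sketch below assumes those modern calls rather than the /dev/epoll interface of the thread; the function name events_for_ready_fd_at_add is made up, and a pipe stands in for the accepted socket.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Returns the number of events epoll_wait() reports for a descriptor
 * that already had unread data when it was registered edge-triggered,
 * or -1 on setup failure. */
int events_for_ready_fd_at_add(void)
{
    int p[2];
    struct epoll_event ev = { .events = EPOLLIN | EPOLLET };
    struct epoll_event out;
    int epfd = epoll_create1(0);
    int n;

    if (epfd < 0 || pipe2(p, O_NONBLOCK) < 0)
        return -1;

    /* Data arrives BEFORE registration, as in the accept() race. */
    if (write(p[1], "x", 1) != 1)
        return -1;

    ev.data.fd = p[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev) < 0)
        return -1;

    n = epoll_wait(epfd, &out, 1, 0);   /* poll once, never block */

    close(epfd);
    close(p[0]);
    close(p[1]);
    return n;
}
```

With that behaviour the event loop needs no special case after registration: the already-pending data shows up in the next wait.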


Post by Tervel Atanassov » Sun, 20 Oct 2002 03:30:14


I am just joining your discussion today for the first time.  I come from
a Windows implementation of async I/O, so please don't hold it against
me.  I can't say that I am following 100 percent, but I think you guys
are talking about what the user API will look like, correct?

Assuming the answer is yes, here are my two cents.  The code you have
below seems a bit awkward -- the line while(do_io(fd) != EAGAIN) appears
twice.  I think the reason for that is that you're trying to do too many
things at once, namely, you're trying to handle both the initial
accept/setup of the socket and its steady state servicing.  I don't see
any benefit to that -- it definitely doesn't make for cleaner code.  Why
not do things separately?

1.  Have a setup phase which more or less does:

*  listen()
*  accept()
*  add the new fd/socket to an "event" which all the worker threads are
waiting on.

2.  Have the worker thread/steady state operation be:

*  event_wait() which returns the fd, some descriptor of what exactly
happened (read/write), the number of bytes transferred.
*  based upon the return from event wait the user updates his state, and
posts the next operation (read/write).

Thanks,

Tervel Atanassov

-----Original Message-----

Behalf Of John Myers
Sent: Friday, October 18, 2002 6:05 PM
To: Charles 'Buck' Krasic
Cc: Davide Libenzi; Benjamin LaHaise; Dan Kegel; Shailabh Nagar;
linux-kernel; linux-aio; Andrew Morton; David Miller; Linus Torvalds;
Stephen Tweedie
Subject: Re: epoll (was Re: [PATCH] async poll for 2.5)


>Or we could have (to make John happier?):

>1 for(;;) {
>2      fd = event_wait(...);
>3      if(fd == my_listen_fd) {
>4           /* new connections */
>5           while((new_fd = my_accept(my_listen_fd, ...)) != EAGAIN) {
>6*                  epoll_addf(new_fd, &pfd, ...);
>7*                  if(pfd.revents & POLLIN) {
>7*                      while(do_io(new_fd) != EAGAIN);
>8*                  }
>8           }
>9       } else {
>10           /* established connections */
>11           while(do_io(fd) != EAGAIN);
>12      }
>13 }

Close.  What we would have is a modification of the epoll_addf()
semantics such that it would have an additional postcondition that if
the new_fd is in the ready state (has data available) then at least one
notification has been generated.  In the code above, the three lines
comprising the if statement labeled "7*" would be removed.


Post by Charles 'Buck' Krasic » Sun, 20 Oct 2002 06:10:06



> Close.  What we would have is a modification of the epoll_addf()
> semantics such that it would have an additional postcondition that if
> the new_fd is in the ready state (has data available) then at least
> one notification has been generated.  In the code above, the three
> lines comprising the if statement labeled "7*" would be removed.

I see.

I assume the kernel implementation is no big deal: epoll_addf() has to
call the kernel internal equivalent to poll() with a zero timeout.

This wouldn't break the first "solution" in my earlier post, but it
would cause every new connection to experience one extra EAGAIN.  

I see three possibilities:

  1) keep the current epoll_addf()
  2) modify it as John suggests, posting the initial ready state in
     the next epoll_getevents()
  3) both: add an option to epoll_addf() that says which of 1 or 2 is desired.

-- Buck

How hard would it be to modify the current epoll code to work that
way?  I'd assume it's just a matter of having epoll_addf call the legacy
poll() code to check the condition (with a zero timeout).

-- Buck


Post by Davide Libenzi » Sun, 20 Oct 2002 07:40:06



> >It's a very simple concept. That means that after a
> >connect()/accept() you have to start using the fd, because I/O space might
> >be available for read()/write(). Dropping an event is an attempt to use
> >the API like poll() & Co., where after an fd is born it is put inside the
> >set to be woken up later. You're basically saying "the kernel should drop an
> >event at creation time" and I'm saying that, to keep the API usage
> >consistent with "use the fd until EAGAIN", you have to use the fd as soon as
> >it becomes available.

> Here's where your argument is inconsistent with the Linux philosophy.

> Linux has a strong philosophy of practicality.  The goal of Linux is to
> do useful things, including providing applications with the semantics they
> need to do useful things.  The criteria for deciding what goes into
> Linux are heavily weighted towards what works best in practice.

> Whether or not some API matches someone's Platonic ideal of an OS
> interface is not a criterion.  In Linux, APIs are judged by their
> practical merits.  This is why Linux does not have such things as
> message passing and separate address spaces for drivers.

> So whether or not a proposed set of epoll semantics is consistent with
> your Platonic ideal of "use the fd until EAGAIN" is simply not an issue.
>  What matters is what works best in practice.

Luckily enough, being the only one who has wasted my time over the last
couple of days arguing against the API semantics, you are pretty far down
the list of people who get to decide what "works best in practice".

- Davide


Post by Mark Mielke » Sun, 20 Oct 2002 09:10:04



> So whether or not a proposed set of epoll semantics is consistent with
> your Platonic ideal of "use the fd until EAGAIN" is simply not an issue.
> What matters is what works best in practice.

From this side of the fence: One vote for "use the fd until EAGAIN" being
flawed. If I wanted a method of monopolizing the event loop with real time
priorities, I would implement real time priorities within the event loop.

mark

--

.  .  _  ._  . .   .__    .  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/    |_     |\/|  |  |_  |   |/  |_   |
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
                       and in the darkness bind them...

                           http://mark.mielke.cc/


Post by Davide Libenzi » Sun, 20 Oct 2002 19:20:10




> > So whether or not a proposed set of epoll semantics is consistent with
> > your Platonic ideal of "use the fd until EAGAIN" is simply not an issue.
> > What matters is what works best in practice.

> From this side of the fence: One vote for "use the fd until EAGAIN" being
> flawed. If I wanted a method of monopolizing the event loop with real time
> priorities, I would implement real time priorities within the event loop.

You don't need to "use the fd until EAGAIN"; you can consume even only one
byte out of 10000 and stop using the fd, as long as you keep that fd in
your ready-list. As soon as you receive an EAGAIN from that fd, you remove
it from your ready-list, and the next time you go fishing for events it
will reemerge as soon as it has something for you. The concept is very
simple: "you must not go waiting for events for a given fd before
having consumed its I/O space".
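The ready-list idea can be sketched in a few lines of C. This is only one possible reading of the description above: service_once and demo_ready_list are hypothetical names, the "ready-list" is reduced to a single boolean for one descriptor, and a pipe stands in for a socket.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/* Read at most `chunk` bytes from a non-blocking fd.
 * Returns true if the fd should stay on the application's ready-list,
 * false once read() reports EAGAIN (or EOF or an error). */
bool service_once(int fd, size_t chunk, long *consumed)
{
    char buf[256];
    ssize_t n;

    if (chunk > sizeof buf)
        chunk = sizeof buf;
    n = read(fd, buf, chunk);
    if (n > 0) {
        *consumed += n;
        return true;     /* more may be pending: keep it on the list */
    }
    if (n < 0 && errno == EINTR)
        return true;
    return false;        /* EAGAIN, EOF or error: drop from the list */
}

/* One fd holding 10 bytes, serviced 4 bytes per turn. Returns the
 * number of turns taken before the fd left the ready-list, or -1. */
int demo_ready_list(void)
{
    int p[2];
    long consumed = 0;
    int turns = 0;
    bool ready = true;   /* a one-entry "ready-list" */

    if (pipe2(p, O_NONBLOCK) < 0)
        return -1;
    if (write(p[1], "0123456789", 10) != 10)
        return -1;

    while (ready) {
        ready = service_once(p[0], 4, &consumed);
        turns++;         /* other fds would get their turn here */
    }
    close(p[0]);
    close(p[1]);
    return consumed == 10 ? turns : -1;
}
```

The point of the design is fairness: no single descriptor has to be drained in one go, yet the application only goes back to waiting on a descriptor after that descriptor has answered EAGAIN.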

- Davide


Post by Dan Kegel » Sun, 20 Oct 2002 19:40:11




>>So whether or not a proposed set of epoll semantics is consistent with
>>your Platonic ideal of "use the fd until EAGAIN" is simply not an issue.
>>What matters is what works best in practice.

> From this side of the fence: One vote for "use the fd until EAGAIN" being
> flawed. If I wanted a method of monopolizing the event loop with real time
> priorities, I would implement real time priorities within the event loop.

The choice I see is between:
1. re-arming the one-shot notification when the user gets EAGAIN
2. re-arming the one-shot notification when the user reads all the data
    that was waiting (such that the very next read would return EAGAIN).

#1 is what Davide wants; I think John and Mark are arguing for #2.

I suspect that Davide would be happy with #2, but advises
programmers to read until EAGAIN anyway just to make things clear.

If the programmer is smart enough to figure out how to do that without
hitting EAGAIN, that's fine.  Essentially, if he tries to get away
without getting an EAGAIN, and his program stalls because he didn't
read all the data that's available and thereby doesn't reset the
one-shot readiness event, it's his own damn fault, and he should
go back to using level-triggered techniques like classical poll()
or blocking i/o.
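One way to "do that without hitting EAGAIN" on a stream descriptor is to treat a short read as proof that the data has run out, a shortcut that the current epoll(7) manual page also describes for stream-oriented files. The sketch below is an illustration under that assumption; drain_counting and demo_read_calls are hypothetical names, the 8-byte buffer is arbitrary, and a pipe stands in for a socket.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Drain a non-blocking stream fd, counting read() calls in *calls.
 * A short read is treated as "I/O space exhausted", which saves the
 * final read() that would only return EAGAIN.
 * Returns the number of bytes read, or -1 on error. */
long drain_counting(int fd, int *calls)
{
    char buf[8];
    long total = 0;

    *calls = 0;
    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        (*calls)++;
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return (errno == EAGAIN || errno == EWOULDBLOCK) ? total : -1;
        }
        total += n;
        if ((size_t)n < sizeof buf)
            return total;   /* short read (or EOF): nothing left right now */
    }
}

/* Writes `len` bytes (at most 64) into a pipe and drains it; returns
 * the number of read() calls used, or -1 on failure. */
int demo_read_calls(size_t len)
{
    int p[2], calls;
    char msg[64] = { 0 };

    if (len > sizeof msg || pipe2(p, O_NONBLOCK) < 0)
        return -1;
    if (len > 0 && write(p[1], msg, len) != (ssize_t)len)
        return -1;
    if (drain_counting(p[0], &calls) != (long)len)
        calls = -1;
    close(p[0]);
    close(p[1]);
    return calls;
}
```

When the pending data is smaller than the buffer, one read() is enough. When it fills the buffer exactly, the extra read() that returns EAGAIN is still needed, so the shortcut is an optimization and not a replacement for handling EAGAIN.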

- Dan
