Is this little POSIX-threads example deadlock and race-condition free?

Is this little POSIX-threads example deadlock and race-condition free?

Post by Marc Rochki » Sat, 25 Jan 2003 06:24:40



Below is a little example program I wrote (for a book I'm working on)
that consists of two threads. One reads words from the user and adds
them to an array; the second sorts the array every second or so, as
needed. The user can type a - to see the sorted list. Here's an
example run:

$ pgm
Word? zzz
Word? aaa
Word? ddd
Word? -
Not sorted -- try again later
Word? -
        aaa
        ddd
        zzz
Word? <EOT>

Even though both threads are operating on the same data structure,
I've arranged things in such a way that no semaphores are needed.

Or, so I think! ;-)

The question is, have I missed anything?

-------------------------------------------

/*
    In the main function, words are read and added to an array. A
    separate thread sorts the words every second or so, if needed.

    There are no semaphores.

    Question: Is this program free of deadlocks and race conditions?
*/

#include "defs.h" /* contains includes, etc. */
#include <pthread.h>

#define MAX_ENTRIES 100

struct dict {
    int d_nentries;
    char *d_entry[MAX_ENTRIES];
    bool d_issorted;

Quote:};

/*
    NOTE: In the original program, the error handling is much more
    sophisticated than just printing a message and terminating. This
    simpler approach has been introduced for posting on
    comp.unix.programmer.
*/
static void fatal(const char *msg)
{
    fprintf(stderr, "%s\n", msg);
    exit(EXIT_FAILURE);

Quote:}

static int compar(const void *x, const void *y)
{
    return strcmp(*(char **)x, *(char **)y);

Quote:}

static void *sorter(void *arg)
{
    struct dict *d = arg;

    while (true) {
        if (!d->d_issorted) {
            qsort(d->d_entry, d->d_nentries, sizeof(char *), compar);
            d->d_issorted = true;
        }
        sleep(1);
    }

Quote:}

int main(void)
{
    struct dict dictionary;
    pthread_t tid;
    char word[100];
    int i;
    size_t wordlen;

    dictionary.d_nentries = 0;
    dictionary.d_issorted = true;
    if (pthread_create(&tid, NULL, sorter, &dictionary) != 0)
        fatal("pthread_create");
    while (!feof(stdin)) {
        printf("Word? ");
        if (fgets(word, sizeof(word), stdin) == NULL && ferror(stdin))
            fatal("fgets");
        if (word[0] == '-') {
            if (dictionary.d_issorted)
                for (i = 0; i < dictionary.d_nentries; i++)
                    printf("\t%s\n", dictionary.d_entry[i]);
            else
                printf("Not sorted -- try again later\n");
            continue;
        }
        wordlen = strlen(word);
        if (word[wordlen - 1] == '\n')
            word[--wordlen] = '\0';
        if (dictionary.d_nentries >= MAX_ENTRIES)
            fatal("too many words");
        if ((dictionary.d_entry[dictionary.d_nentries] =
          malloc(wordlen + 1)) == NULL)
            fatal("malloc");
        strcpy(dictionary.d_entry[dictionary.d_nentries], word);
        /*
            Make entry visible to sorter. Value of d_issorted is
            unknown.
        */
        dictionary.d_nentries++;
        /*
            Force sort. May be a redundant one because sort may
            occur between previous statement and next. In this case
            an extra sort will occur later.
        */
        dictionary.d_issorted = false;
    }
    return EXIT_SUCCESS;

Quote:}

 
 
 

Is this little POSIX-threads example deadlock and race-condition free?

Post by Barry Margoli » Sat, 25 Jan 2003 06:56:33




>Below is a little example program I wrote (for a book I'm working on)
>that consists of two threads. One reads words from the user and adds
>them to an array; the second sorts the array every second or so, as
>needed. The user can type a - to see the sorted list. Here's an
>example run:

>$ pgm
>Word? zzz
>Word? aaa
>Word? ddd
>Word? -
>Not sorted -- try again later
>Word? -
>        aaa
>        ddd
>        zzz
>Word? <EOT>

>Even though both threads are operating on the same data structure,
>I've arranged things in such a way that no semaphores are needed.

>Or, so I think! ;-)

>The question is, have I missed anything?

Theoretically, dictionary.d_nentries++ could be performed in multiple steps
that leave invalid intermediate values in the variable (perhaps the
hardware requires updating each byte separately).  If a sort takes place
while this is happening, it could access uninitialized entries in
dictionary.d_entry[].

In fact, theoretically the assignment diction.d_issorted=false could also
leave incorrect intermediate values, although this is *very* unlikely.

--

Genuity, Woburn, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.

 
 
 

Is this little POSIX-threads example deadlock and race-condition free?

Post by Eric Sosma » Sat, 25 Jan 2003 07:27:25



> Below is a little example program I wrote (for a book I'm working on)
> that consists of two threads. One reads words from the user and adds
> them to an array; the second sorts the array every second or so, as
> needed. The user can type a - to see the sorted list.
> [...]
> Even though both threads are operating on the same data structure,
> I've arranged things in such a way that no semaphores are needed.

> Or, so I think! ;-)

> The question is, have I missed anything?

    Two things, at any rate -- perhaps more.

    First, there's a race condition involving the d_issorted
flag.  The reader thread sets it false after adding a new
word, while the sorter thread sets it true after completing
a sort.  These manipulations are not synchronized, so you
could get a sequence of events like

        Sorter: Wakes up, tests d_issorted, finds it false,
        and begins sorting.

        Context switch: Sorter is suspended, Reader awakens.

        Reader: Adds a word to the end of the array, sets
        d_issorted false, goes back to sleep.

        Sorter: Awakens, finishes sorting, sets d_issorted
        true.

... leaving d_issorted true even though the final word has
not yet been sorted into its proper place.

    Second, there are memory visibility and ordering issues,
which are very likely to afflict your code if it ever runs
on a multiprocessor system.  Suffice it to say that modern
CPUs and caches tend to do things like rearrange memory
operations for greater speed: even though a program running
on CPU 1 writes to location X before writing to location Y,
a program on CPU 2 may see Y take on its new value while X
still has the old one.  Part of the function of the POSIX
synchronization primitives is to ensure "proper" ordering
of this sort of thing, often by use of various kinds of
"memory barrier" instructions.  Without such provisions,
there's simply no telling how long it will take before CPU 2
observes a change to the data in location X.  Hence, even
though the reader thread carefully fills in the new array
slot before incrementing d_entries, the sorter thread on a
different CPU may see the new d_entries value while it's
still seeing the old array slot, and hence stumble across
bad data.

    "Don't do that."

--

 
 
 

Is this little POSIX-threads example deadlock and race-condition free?

Post by Joseph Seig » Sat, 25 Jan 2003 08:55:30



> Below is a little example program I wrote (for a book I'm working on)
> that consists of two threads. One reads words from the user and adds
> them to an array; the second sorts the array every second or so, as
> needed. The user can type a - to see the sorted list. Here's an
> example run:

> $ pgm
> Word? zzz
> Word? aaa
> Word? ddd
> Word? -
> Not sorted -- try again later
> Word? -
>         aaa
>         ddd
>         zzz
> Word? <EOT>

> Even though both threads are operating on the same data structure,
> I've arranged things in such a way that no semaphores are needed.

> Or, so I think! ;-)

> The question is, have I missed anything?

Yes.

For one thing, you are not guaranteed that your storage accesses actually
occur in the order that you specify in the program.  Your program logic
depends on the accesses happening in a certain order.  Also you are
assuming the accesses are atomic.  For int they usually are but you
are still assuming it.

Another thing is you may be printing out the sorted list while a sort is
in progress.  The intermediate values of the list are undefined while this
is happening.

Joe Seigh

 
 
 

Is this little POSIX-threads example deadlock and race-condition free?

Post by Barry Margoli » Sat, 25 Jan 2003 09:00:57




Quote:>Another thing is you may be printing out the sorted list while a sort is
>in progress.  The intermediate values of the list are undefined while this
>is happening.

Assuming the assignment order is *not* rearranged, his program didn't have
this problem.  It checks d_issorted before printing out the sorted list,
and doesn't print it if it's false.  Since d_issorted isn't set until after
the list has been sorted, the printing routine should never see
intermediate values of the list.

--

Genuity, Woburn, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.

 
 
 

Is this little POSIX-threads example deadlock and race-condition free?

Post by Joseph Seig » Sat, 25 Jan 2003 12:12:08





> >Another thing is you may be printing out the sorted list while a sort is
> >in progress.  The intermediate values of the list are undefined while this
> >is happening.

> Assuming the assignment order is *not* rearranged, his program didn't have
> this problem.  It checks d_issorted before printing out the sorted list,
> and doesn't print it if it's false.  Since d_issorted isn't set until after
> the list has been sorted, the printing routine should never see
> intermediate values of the list.

True if order is preserved by both compiler and processor.  Though there
are still issues.  A really late store of d_issorted to true could occur
after a number of words were added and d_issorted set to false.  Sort
would not run again.  Order could go something like

     main:                              sort:

                                        sort list

     add word
     nentries++
     d_issortd = false

                                       d_issortd = true

     print list

Joe Seigh

 
 
 

Is this little POSIX-threads example deadlock and race-condition free?

Post by Casper H.S. Di » Sat, 25 Jan 2003 19:51:52



>Below is a little example program I wrote (for a book I'm working on)
>that consists of two threads. One reads words from the user and adds
>them to an array; the second sorts the array every second or so, as
>needed. The user can type a - to see the sorted list. Here's an
>example run:
>        if ((dictionary.d_entry[dictionary.d_nentries] =
>          malloc(wordlen + 1)) == NULL)
>            fatal("malloc");
>        strcpy(dictionary.d_entry[dictionary.d_nentries], word);
>        /*
>            Make entry visible to sorter. Value of d_issorted is
>            unknown.
>        */
>        dictionary.d_nentries++;
>            Force sort. May be a redundant one because sort may
>            occur between previous statement and next. In this case
>            an extra sort will occur later.
>        */
>        dictionary.d_issorted = false;

There is no guarantee that the increment of d_nentries and the change
to d_issorted are visible to the other thread after you've copied the
word.

I.e., it is possible that:

        - you sort with an initialized word which may even cause an
          unstable sort (because the strings slowly becomes visible).

        - you may even do a strcmp on a NULL pointer because the store
          of the malloc return value is not visible when the sort
          starts.

Casper
--
Expressed in this posting are my opinions.  They are in no way related
to opinions held by my employer, Sun Microsystems.
Statements on Sun products included here are not gospel and may
be fiction rather than truth.

 
 
 

Is this little POSIX-threads example deadlock and race-condition free?

Post by Joseph Seig » Sat, 25 Jan 2003 21:14:48


Assuming that memory accesses are in order and accesses to int are atomic, you
can fix the race conditions by using a distributed algorithm, i.e. threads can
read but not write data owned by other threads.

Instead of d_issorted which is shared, you add another counter, d_nsorted, which
is the number of sorted entries.  Only the main thread can update d_nentries and
only the sort thread can update d_nsorted.  The following invariant holds

      d_nsortd <= d_nentries

Main thread logic is changed to not print sorted list of (d_nsorted < d_nentries).

Sort thread logic is changed to sort if (d_nsorted < d_nentries).  If it sorts it
must make a temporary local copy of d_nentries to ensure that number of words that
it sorts is same that d_nsorted is set to.  Also, 1 second is a long time, you
might want to change the if statement to a while statement, e.g.

          while (true) {
             while (d->d_nsorted < d->d_nentries) {  
                 temp_nsorted = d->d_nentries;
                 qsort(d->d_entry, temp_nsorted, ...)
                  d->d_nsorted = temp_nsorted;
             }
             sleep(1);
         }

Joe Seigh

 
 
 

Is this little POSIX-threads example deadlock and race-condition free?

Post by Marc Rochki » Sun, 26 Jan 2003 08:45:46


Thanks everyone! Very helpful stuff. I have retitled this program "A
Bad Example," and used it as an excuse to introduce mutexes.

--Marc

 
 
 

1. Race Condition -- Can someone explain in this example ?

Hi,

L0pht uncovered a race condition in a software offering and documented
their finding here

http://www.l0pht.com/advisories/ClearCase.txt

(sorry about link...doc is a somewhat lengthy) and I'm trying to figure
out where the race condition comes into play.  I am a fledgling C
programmer and I'm trying to learn how not to be a bad coder.

Thanks for any responses !

2. Disk clean on bootup, fsck says it's not clean

3. Solaris2.4 + 9GIG + NFS + SPARC20 = race-condition?

4. Ingres install?

5. POSIX threads, thread-specific data: what about the "main" thread?

6. accessing MS SQL server on NT from Linux

7. Posix-Threads on Linux

8. Problems with IPC under linux

9. panic: Deadlock condition bug?

10. Panic on Solaris: Deadlock condition detected: cycle in blocking chain

11. In/Output buffer race conditions

12. Race condition with MOD_INC_USE_COUNT?

13. Wait queues and a race condition on 2.2.x