Buffered file reads, and speeds of C prog vs. wc vs. awk vs. perl

Buffered file reads, and speeds of C prog vs. wc vs. awk vs. perl

Post by Ajay Sh » Wed, 01 Apr 1992 10:34:04



Earlier I had asked how to better buffer the reading of a file.
Many people replied; there is a function in ANSI C named setvbuf
which hooks a buffer (which you supply) for use with a FILE * you
specify.
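
For reference, a minimal sketch of the call (the function name and its
arguments here are mine, just to illustrate; the test program further
down uses the real thing):

/* ANSI C: setvbuf() must be called after fopen() but before any other
   operation on the stream, and returns 0 on success.  _IOFBF asks for
   full buffering; _IOLBF and _IONBF select line buffering and no
   buffering. */
#include <stdio.h>
#include <stdlib.h>

int use_big_buffer(FILE *f, size_t size)
{
    char *buf = (char *) malloc(size);

    if (buf == NULL)
        return -1;
    return setvbuf(f, buf, _IOFBF, size);
}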

H & S also say "you almost never need to modify the default buffering".
It looks like they are right in my (limited) timing tests.

So here are numbers:

1. C program, reading a file and merely counting lines, with default IO
        5.7u 2.1s 0:12 63% 0+144k 1494+1io 1494pf+0w
2. #1, with 10 Meg of buffer (through setvbuf)
        6.1u 2.9s 0:12 74% 0+9008k 629+0io 632pf+0w
3. gawk 'END {print NR}'
        3.6u 2.3s 0:07 74% 0+496k 561+0io 572pf+0w
4. echo 'END {print NF}' | a2p > a.perl; chmod +x a.perl; a.perl
        9.2u 10.3s 02 86% 0+760k 76+1io 122pf+0w
5. wc -l
        20.9u 2.0s 0:25 88% 0+280k 589+0io 591pf+0w

wc is a joke; gawk is great; perl is worse than awk; the default IO
of the C program is not improved by a 10 Meg buffer.

Compiler/flags for C program didn't matter much.  I tried /bin/cc -O4
and gcc v1.37 -O.

Info about the data file: it was 180,000 lines, roughly 18 Meg.

This computer:

OS/MP 4.1A.1 Export(S5GENERIC/root)#0: Mon Sep 30 16:11:19 1991
System type is Solbourne Series5e/900 with 128 MB of memory.

The C program:

#include <stdio.h>
#include <stdlib.h>     /* for atoi() and malloc() */

int main(argc, argv)
     int argc;
     char **argv;
{
    int bufsize, lines=0;
    char *buffer, *s;
    FILE *f;

    if (argc != 2) {
        fprintf(stderr, "Usage: %s buffersize\n", argv[0]);
        fprintf(stderr, "buffersize = 0 ==> setvbuf() is NOT called.\n");
        return 1;
    }
    bufsize = atoi(argv[1]);
    buffer = (char *) malloc(bufsize*sizeof(char));
    f = fopen("bigfile", "r");
    if (f == NULL) {
        perror("bigfile");
        return 1;
    }
    /* setvbuf() must be called before the first operation on f */
    if (bufsize != 0) setvbuf(f, buffer, _IOFBF, bufsize);

#define MAXCHARS 65536
    s = (char *) malloc(MAXCHARS*sizeof(char));
    while (NULL != fgets(s, MAXCHARS, f))
      ++lines;
    printf("%d\n", lines);
    return 0;
}

--


 
 
 

Buffered file reads, and speeds of C prog vs. wc vs. awk vs. perl

Post by Ajay Sh » Wed, 01 Apr 1992 11:24:56



>1. C program, reading a file and merely counting lines, with default IO
>        5.7u 2.1s 0:12 63% 0+144k 1494+1io 1494pf+0w
>2. #1, with 10 Meg of buffer (through setvbuf)
>        6.1u 2.9s 0:12 74% 0+9008k 629+0io 632pf+0w
>3. gawk 'END {print NR}'
>        3.6u 2.3s 0:07 74% 0+496k 561+0io 572pf+0w

I have to ask -- how does gawk do this faster than the very
stripped-down C program?  The C program seems to have NO overhead.

And why does giving it a 10 Meg buffer actually slow things down?

        -ans.
--



 
 
 

Buffered file reads, and speeds of C prog vs. wc vs. awk vs. perl

Post by Dave Eis » Wed, 01 Apr 1992 23:51:22




>>1. C program, reading a file and merely counting lines, with default IO
>>        5.7u 2.1s 0:12 63% 0+144k 1494+1io 1494pf+0w
>>3. gawk 'END {print NR}'
>>        3.6u 2.3s 0:07 74% 0+496k 561+0io 572pf+0w

>I have to ask -- how does gawk do this faster than the very
>stripped-down C program?  The C program seems to have NO overhead.

Sure it did. The C program used stdio which does a memory to memory
copy from the stdio buffer to the buffer passed to fgets. Gawk uses a
different buffering method, one that is more efficient in this case.
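
To make the copy concrete, here is roughly what each fgets() call has
to do.  (This is a sketch modeled on the textbook implementation, not
on any particular vendor's stdio.)

#include <stdio.h>

/* What fgets(s, n, f) boils down to: pull bytes out of the stdio
   buffer one at a time and copy each into the caller's buffer. */
char *fgets_sketch(char *s, int n, FILE *f)
{
    int c = EOF;
    char *p = s;

    while (--n > 0 && (c = getc(f)) != EOF) {
        *p++ = (char) c;        /* the memory-to-memory copy */
        if (c == '\n')
            break;
    }
    if (p == s && c == EOF)
        return NULL;            /* nothing was read */
    *p = '\0';
    return s;
}

Even with getc() as a macro, every byte of the 18 Meg file gets touched
twice -- once going into the stdio buffer, once coming out of it --
and it is presumably that second touch that gawk's buffering avoids.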

--

      There's something in my library to offend everybody.
        --- Washington Coalition Against Censorship

 
 
 

Buffered file reads, and speeds of C prog vs. wc vs. awk vs. perl

Post by Steve Nuch » Thu, 02 Apr 1992 00:48:30



>And why does giving it a 10 Meg buffer actually slow things down?

Page faults, probably.
--
Steve Nuchia      South Coast Computing Services, Inc.      (713) 661-3301
 
 
 

Buffered file reads, and speeds of C prog vs. wc vs. awk vs. perl

Post by goo » Thu, 02 Apr 1992 08:09:01


   1. C program, reading a file and merely counting lines, with default IO
           5.7u 2.1s 0:12 63% 0+144k 1494+1io 1494pf+0w
   2. #1, with 10 Meg of buffer (through setvbuf)
           6.1u 2.9s 0:12 74% 0+9008k 629+0io 632pf+0w
   3. gawk 'END {print NR}'
           3.6u 2.3s 0:07 74% 0+496k 561+0io 572pf+0w
   4. echo 'END {print NF}' | a2p > a.perl; chmod +x a.perl; a.perl
           9.2u 10.3s 02 86% 0+760k 76+1io 122pf+0w
   5. wc -l
           20.9u 2.0s 0:25 88% 0+280k 589+0io 591pf+0w

   wc is a joke; gawk is great; perl is worse than awk; the default IO
   of the C program is not improved by a 10 Meg buffer.

I would not go that far, but that's beside the point.

If you really want to write a fast version of wc, then you've gotta
access the system calls directly. This is how I beat out everything I
had available. (gawk got nuked here)

/* wc.c -- count newlines by scanning read() buffers directly */
#include <stdio.h>
#include <unistd.h>

int main()
{
  char buf[8 * 1024];
  char *bufp;
  char *bufe;
  int n, lines = 0;

  while ((n = read(0, buf, 8 * 1024)) > 0)
    {
      bufp = buf;
      bufe = buf + n;

      while (bufp < bufe)
        if (*bufp++ == '\n')
          lines++;
    }

  printf("%d\n", lines);
  return 0;
}

And the hacked, optimized version:
/* wc2.c -- the same thing with the inner loop unrolled (Duff's device) */
#include <stdio.h>
#include <unistd.h>

int main()
{
  char buf[8 * 1024];
  char *bufp;
  char *bufe;
  int n, lines = 0;

  while ((n = read(0, buf, 8 * 1024)) > 0)
    {
      bufp = buf;
      bufe = buf + n;

      /* hi tom d!  Jump into the unrolled loop so that the first pass
         handles the n % 8 leftover bytes; every later pass does a full
         group of 8. */
      switch (n % 8)
        {
        case 0: do { if (*bufp++ == '\n') lines++;
        case 7:      if (*bufp++ == '\n') lines++;
        case 6:      if (*bufp++ == '\n') lines++;
        case 5:      if (*bufp++ == '\n') lines++;
        case 4:      if (*bufp++ == '\n') lines++;
        case 3:      if (*bufp++ == '\n') lines++;
        case 2:      if (*bufp++ == '\n') lines++;
        case 1:      if (*bufp++ == '\n') lines++;
                   } while (bufp < bufe);
        }
    }

  printf("%d\n", lines);
  return 0;
}

Here are the times for the biggest text file that I could find (awk
dies on binary files)

1) system wc: 2.8 real         1.3 user         0.5 sys

2) awk:       1.7 real         1.2 user         0.2 sys

3) perl:      3.3 real         0.7 user         1.0 sys

4) my wc:     1.1 real         0.9 user         0.1 sys

5) my wc2:    0.5 real         0.3 user         0.1 sys

ewq

--

Erik Quanstrom          
St. Olaf College        
Northfield MN, 55057.5 USA                      

 
 
 

Buffered file reads, and speeds of C prog vs. wc vs. awk vs. perl

Post by Bill Campbe » Thu, 02 Apr 1992 10:02:37



>So here are numbers:
>1. C program, reading a file and merely counting lines, with default IO
>        5.7u 2.1s 0:12 63% 0+144k 1494+1io 1494pf+0w
>2. #1, with 10 Meg of buffer (through setvbuf)
>        6.1u 2.9s 0:12 74% 0+9008k 629+0io 632pf+0w
>3. gawk 'END {print NR}'
>        3.6u 2.3s 0:07 74% 0+496k 561+0io 572pf+0w
>4. echo 'END {print NF}' | a2p > a.perl; chmod +x a.perl; a.perl
>        9.2u 10.3s 02 86% 0+760k 76+1io 122pf+0w
>5. wc -l
>        20.9u 2.0s 0:25 88% 0+280k 589+0io 591pf+0w
>wc is a joke; gawk is great; perl is worse than awk; the default IO
>of the C program is not improved by a 10 Meg buffer.

My results are a bit different, possibly because I'm using an
equivalent perl program rather than the output of a2p, which is hardly
optimal.  The first machine is an Intel 303, 386-33 running SCO Xenix
2.3.4 with less than optimal free space on the hard disk, and it's
been a REAL LONG TIME since I've done anything about the disk
fragmentation.  The second timings are on a new Intel 403/E
486-33 EISA system running SCO UNIX 3.2v4.  The buffered C
program wasn't significantly faster than the unbuffered one.

+ ls -l files.all
-rw-r--r--   1 root     root     4074960 Mar 31 01:44 files.all
+ time awk 'END {print NR}' files.all
+ time perl -e 'while(<>){}; print "$.\n"' files.all
+ time wc -l files.all
+ gcc -O tmp.c; time ./a.out

             Intel 303 386-33 Xenix 2.3.4
          time   awk    perl   wc -l   gcc -O
          ___________________________________
          real   91.1   86.6   141.6    83.1
          user   30.7    5.9    12.3     1.9
          sys     4.6    3.7     4.4     3.8

            Intel 403 486-33 SCO UNIX 3.2v4
          time   awk    perl   wc -l   gcc -O
          ___________________________________
          real   15.4   11.4    11.4     8.9
          user    7.7    6.3     5.7     2.5
          sys     3.1    2.9     2.7     2.9

In this timing, perl is significantly faster than awk on the ISA
bus machine, but they're pretty even on the EISA.  I have decided
that I want to sell my ISA system and replace it with an EISA one :-).

Bill
--

UUCP:   ...!thebes!camco!bill   6641 East Mercer Way
             uunet!camco!bill   Mercer Island, WA 98040; (206) 947-5591
SPEED COSTS MONEY -- HOW FAST DO YOU WANT TO GO?

 
 
 

Buffered file reads, and speeds of C prog vs. wc vs. awk vs. perl

Post by Andrew Hu » Thu, 02 Apr 1992 13:57:05



> So here are numbers:

> 1. C program, reading a file and merely counting lines, with default IO
>         5.7u 2.1s 0:12 63% 0+144k 1494+1io 1494pf+0w
> 2. #1, with 10 Meg of buffer (through setvbuf)
>         6.1u 2.9s 0:12 74% 0+9008k 629+0io 632pf+0w
> 3. gawk 'END {print NR}'
>         3.6u 2.3s 0:07 74% 0+496k 561+0io 572pf+0w
> 4. echo 'END {print NF}' | a2p > a.perl; chmod +x a.perl; a.perl
>         9.2u 10.3s 02 86% 0+760k 76+1io 122pf+0w
> 5. wc -l
>         20.9u 2.0s 0:25 88% 0+280k 589+0io 591pf+0w

> wc is a joke; gawk is great; perl is worse than awk; the default IO
> of the C program is not improved by a 10 Meg buffer.

> --



we also had a posting by erik quanstrom from st olaf's (home of mike haertel
and gnu egrep!):

1) system wc: 2.8 real         1.3 user         0.5 sys
2) awk:       1.7 real         1.2 user         0.2 sys
3) perl:      3.3 real         0.7 user         1.0 sys
4) my wc:     1.1 real         0.9 user         0.1 sys
5) my wc2:    0.5 real         0.3 user         0.1 sys

-----------------------------------------------------------------------------

                it is indeed an instructive thought experiment to determine
the quickest way to count lines without measuring. i feel obliged to discuss these
results a little although there is precious little deep thought involved.
I will also ignore the perl numbers; not because of larry wall's taste in
tuxedos(sic?) but because i have no experience with it.

        my initial guesses would be (in quickening order):

1)      sed -n '$=' < file
                slowest because i bet it would package lines internally
                and keep all sorts of details about them.

2)      awk 'END { print NR }' < file
                a fair bit faster than sed because it is better about I/O,
                but still a fair amount of administrative overhead.

3)      C program while(fgets())lines++;
                who knows? stdio normally sucks so badly but it varies.

4)      (regular==bad) wc -l < file
                the regular wc implementation is simple and slow

5)      egrep -c '^' < file
                normally quite good if someone has thought about I/O.
                and i'll bet the gnu egrep has.

6)      (good) wc -l
                i actually mean a good general-purpose wc. (a really clever
                wc would switch between 2 or 3 algorithms depending on what it
                had to do)

7)      C program; big buffers, memchr
                must be the best (unless memchr is broken); a sketch
                follows just below.
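
for concreteness, a sketch of the memchr idea (not anyone's actual
program; 8k and 64k are simply the two buffer sizes timed further
down):

/* count newlines by letting memchr() find them in big read() chunks */
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define BUFSIZE (64 * 1024)

int main()
{
  static char buf[BUFSIZE];
  char *p, *end;
  int n, lines = 0;

  while ((n = read(0, buf, BUFSIZE)) > 0)
    {
      p = buf;
      end = buf + n;
      while ((p = memchr(p, '\n', end - p)) != NULL)
        {
          lines++;
          p++;
        }
    }
  printf("%d\n", lines);
  return 0;
}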

as for erik's programs, i would have guessed somewhere around 6), say 6.2).

so how did we do guessing? i ran this on a file with shorter lines:
pyxis=; wc gre.bug
 189260  189260 4920734 gre.bug

on an SGI but with mostly 10th edition software:

program         usr+sys
---------------------------
awk:            5.2+0.4
sed:            3.4+2.3
fgets(10M):     2.6+1.7
erik.wc:        2.5+1.3
gre:            2.4+0.3
fgets(default): 2.1+1.5
erik.wc2:       1.8+1.0
(sgi's) wc:     1.7+0.3
memchr(8k):     1.5+0.8
erik.wc-O4:     1.5+0.8
memchr(64k):    1.3+0.8
erik.wc2-O4:    1.3+0.8
wc:             1.1+0.4

        well, awk didn't do as well as i expected. erik's programs
are quite sensitive to the optimiser. sgi's stdio seems to be quite fast.
and i am surprised my wc goes faster than memchr! (and please note,
it is figuring out the word and char count too.)

        of course, take these timings with a grain of salt; the error
is probably large -- .1 or .2 or so.

                andrew hume

 
 
 

Buffered file reads, and speeds of C prog vs. wc vs. awk vs. perl

Post by Andy Edwar » Thu, 09 Apr 1992 23:43:54


Hi!

>Earlier I had asked how to better buffer the reading of a file.
>Many people replied; there is a function in ANSI C named setvbuf
>which hooks a buffer (which you supply) for use with a FILE * you
>specify.

>H & S also say "you almost never need to modify the default buffering".
>It looks like they are right in my (limited) timing tests.

This is not totally true for some combinations of machine speed,
machine load, operating system, file cache, disk cache, disk access
time, etc.; see (#) below.

>So here are numbers:

It's best to analyse the numbers that you get from BSD csh time and
build some intuition about what they're telling you:

>1. C program, reading a file and merely counting lines, with default IO
>        5.7u 2.1s 0:12 63% 0+144k 1494+1io 1494pf+0w

                                   ^^^^     ^^^^

A lot of I/O and too much paging for my liking here: it sounds like an
inefficient C program that reads a load of data into a huge buffer
before processing it (no, I haven't read the source yet).

With 128Mb of memory, either you didn't test this on an empty machine,
or alternatively you have a small working set for the process (i.e. a
limited number of pages allowed resident simultaneously).

It seems too much of a coincidence that the number of reads equals the
number of page faults; it looks like every (512-byte) block read
page-faulted to get the memory to store the data read.

>2. #1, with 10 Meg of buffer (through setvbuf)
>        6.1u 2.9s 0:12 74% 0+9008k 629+0io 632pf+0w

                              ^^^^  ^^^     ^^^

A 10Mb buffer is vastly excessive; you will gain nothing on a demand
paged operating system.  Don't forget that both your setvbuf buffer and
the s buffer you pass to fgets are in pageable areas of memory, and the
maxim is: If there's a demand, it's paged!
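
If you want to see the effect in isolation, a toy program like the one
below shows it (the 10Mb size and the 4Kb page size are just
assumptions; run it under csh's time and watch the pf field grow as
SIZE grows):

/* touch every page of a big, freshly malloc'd buffer and do nothing
   else; each first touch of a page costs a page fault */
#include <stdio.h>
#include <stdlib.h>

#define SIZE (10 * 1024 * 1024)   /* 10Mb, like the setvbuf buffer */
#define PAGE 4096                 /* assume 4Kb pages */

int main()
{
    char *p = (char *) malloc(SIZE);
    long i;

    if (p == NULL)
        return 1;
    for (i = 0; i < SIZE; i += PAGE)
        p[i] = 1;                 /* first touch faults the page in */
    printf("touched %ld pages\n", (long) (SIZE / PAGE));
    return 0;
}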

If you run a disk-testing program (source for one called "disktest"
was posted on the net once -- this is the (#) above), you will find
that the combination of operating system, on-disk cache, disk access
timing, different sizes of reads and writes, etc. will give you some
idea of the values at which your program tends to give the best
performance on your machine.

For example, one particular brand of hard disk on a Macintosh IIfx with
System 7 delivers a higher data rate when constantly reading buffers of
around 250Kb than it does with 160Kb or 80Kb.

Not only that: taking up 10Mb of store is not useful for the other
people using the same machine; they'll be paging too!

I wrote a program once on the Mac as an MPW Tool & it had a file
buffer set by me. The tool ran quickly, but when I ported the same
code to unix it paged like hell and ran as slow as an old dog.

So I #ifdef'ed the reading functions back to old fgetc to debug it &
work out what was happening. The result: it ran faster without the
extra buffer space. The code has stayed exactly the same since then.

>3. gawk 'END {print NR}'
>        3.6u 2.3s 0:07 74% 0+496k 561+0io 572pf+0w
>4. echo 'END {print NF}' | a2p > a.perl; chmod +x a.perl; a.perl
>        9.2u 10.3s 02 86% 0+760k 76+1io 122pf+0w
>5. wc -l
>        20.9u 2.0s 0:25 88% 0+280k 589+0io 591pf+0w

>wc is a joke; gawk is great; perl is worse than awk; the default IO
>of the C program is not improved by a 10 Meg buffer.

'scuse me, but what's the important factor here?  Time taken to do the
job?  Speed of reading?  I ask because you say wc is a joke and gawk is
great: they both have roughly the same I/O and page-fault counts, so
neither is an improvement in terms of machine performance; gawk is only
better because it takes less time.

perl looks good: less I/O and fewer page faults.  But don't be deluded;
the file that you read in may still be partly cached in the unix file
buffering system.  Unfortunately it's got the heaviest system time,
which is strange!

There are trade-offs involved here; you can get a handle on them with
(#) and careful programming.

Examining the code: fgets is not a beautiful function anyway; it
probably calls fgetc for each character, stores it, and breaks out of
the loop on EOF or \n.  It's really transferring data from one area of
memory to another (e.g. from the setvbuf buffer to the s buffer passed
to fgets) -- I'd guess this is the source of all the page faults.

>Compiler/flags for C program didn't matter much.  I tried /bin/cc -O4
>and gcc v1.37 -O.

There's hardly anything to optimise in your program. And I would
expect both compilers to generate roughly the same instructions, so
it's the buffering algorithm that is the problem.

>Info about the data file: it was 180,000 lines, roughly 18 Meg.

With this amount of data, wouldn't you be better off structuring the
data in some database system?

>This computer:

>OS/MP 4.1A.1 Export(S5GENERIC/root)#0: Mon Sep 30 16:11:19 1991
>System type is Solbourne Series5e/900 with 128 MB of memory.

>The C program:

>#include <stdio.h>

>int main(argc, argv)
>     int argc;
>     char **argv;
>{
>    int bufsize, lines=0;
>    char *buffer, *s;
>    FILE *f;

>    if (argc != 2) {
>        fprintf(stderr, "Usage: %s buffersize\n", argv[0]);
>        fprintf(stderr, "buffersize = 0 ==> setvbuf() is NOT called.\n");
>        return 1;
>    }
>    bufsize = atoi(argv[1]);
>    buffer = (char *) malloc(bufsize*sizeof(char));
>    f = fopen("bigfile", "r");
>    if (bufsize != 0) setvbuf(f, buffer, _IOFBF, bufsize);

>#define MAXCHARS 65536
>    s = (char *) malloc(MAXCHARS*sizeof(char));
>    while (NULL != fgets(s, MAXCHARS, f))
>      ++lines;
>    printf("%d\n", lines);
>    return 0;
>}
>--



Happy hacking,

+--: Andy Edwards :----------*=================*------------------------------+


| level 2 porter and PAP     | Barrington      | applelink: harlequin         |
| server clone originator.   | Cambridge       |     voice: 0223 872522       |
|   *Life is strange, yeah   | CB2 5RG         |            +44-223-872-522   |
|     compared to what?*     | England         |       fax: 0223 872519       |
+----------------------------*=================*------------------------------+

 
 
 

Buffered file reads, and speeds of C prog vs. wc vs. awk vs. perl

Post by Larry Wa » Fri, 10 Apr 1992 02:19:45


: I would ignore the perl numbers in this thread because they're based on
: a2p-written code, not actual perl code.
:
: Try this instead:
:
: perl -e 'while(<>) {}; print "$.\n";'

Well, that's actually what a2p spits out, to the first approximation...

: Or would you test C with code written in FORTRAN and run through f2c?
:
: (An unscientific test on a SPARC ELC, SunOS 4.1.2, perl 4.0.19:
: > time perl -e 'while(<>) {}; print "$.\n";' < /etc/termcap > /dev/null
: 0.150u 0.370s 0:00.48 108.3% 0+238k 0+0io 0pf+0w
: > time awk 'END {print NR}' < /etc/termcap > /dev/null
: 0.200u 0.070s 0:00.27 100.0% 0+122k 3+0io 2pf+0w
: )

Note that there's a perturbing factor here (in more ways than one).
Where's all that system time going for the perl test?  It turns out
that it's mostly calls to signal routines, caused by supersafe
implementations of setjmp()/longjmp(), used to enter and exit the null
block.  Ick.  We're also probably getting hammered by saving and
restoring larger register windows.  Life used to be simpler...

To get around the longjmp() overhead (until I scrap those calls), you
can say

        perl -le '1 while <>; print $.;'

That'll run faster than sed.  But to get on towards egrep speeds, you
want something like

        perl -le '$sum += tr/\n// while sysread(STDIN,$_,8192); print $sum'

Larry Wall

 
 
 

Buffered file reads, and speeds of C prog vs. wc vs. awk vs. perl

Post by David Pott » Sun, 26 Apr 1992 04:26:10


I use the program talk.  It divides the window into an upper and lower
half. I type in one half, and the other party types in the other half.
We observe the custom of using short sequences to coordinate typing,
e.g., ga means go ahead, but you don't have to do that.  Both parties
can type in their half of the window at the same time--it is no more
difficult than both talking at the same time.

David

--
*                tuktusiuriagatigitqingnapin'ngitkiptin'nga, David Potter

 
 
 

Buffered file reads, and speeds of C prog vs. wc vs. awk vs. perl

Post by Forest Edward Wilkins » Fri, 15 May 1992 16:32:23



>I use the program talk.  It divides the window into an upper and lower

We use ytalk.  It's compatible with several versions of talk, with some
extra features, such as the ability to talk to more than one person at
once.
 
 
 

Buffered file reads, and speeds of C prog vs. wc vs. awk vs. perl

Post by Clark L. Colem » Sat, 16 May 1992 04:41:56



[worthless drivel deleted]

Please learn to post to the appropriate newsgroups. Chat programs have
absolutely nothing to do with comp.benchmarks. Learn to edit followup
lines, etc.

--
-----------------------------------------------------------------------------
"It is seldom that any liberty is lost all at once." David Hume

 
 
 

Buffered file reads, and speeds of C prog vs. wc vs. awk vs. perl

Post by Jeff Chuli » Wed, 20 May 1992 20:18:26



> We use ytalk.  It's compatible with several versions of talk, with some
> extra features, such as the ability to talk to more than one person at
> once.

Is ytalk a freeware program?  If so do you know where I can get
a copy of the chat program?  Thank you.

Jeff Chulick
--
Radiation Treatment --
   "The same power that burned Hiroshima causing three-legged babies and
    death shrunk to the size of nickel to help him regain his breath"

 
 
 

Buffered file reads, and speeds of C prog vs. wc vs. awk vs. perl

Post by Jonathan I. Kame » Thu, 21 May 1992 02:47:46



|> Is ytalk a freeware program?  If so do you know where I can get
|> a copy of the chat program?  Thank you.

The author said in an earlier posting that it is available in
/pub/ytalk/ytalk.shar on:

        bongo.cc.utexas.edu (128.83.186.13)
        ix1.cc.utexas.edu   (128.83.1.21)
        ix2.cc.utexas.edu   (128.83.1.29)

(What does this have to do with comp.lang.c or comp.benchmarks, or for that
matter comp.unix.programmer?  I have cross-posted and directed followups to
comp.sources.d.)

--

MIT Information Systems/Athena              Moderator, news.answers
    (Send correspondence related to the news.answers newsgroup
        {and ONLY correspondence related to the newsgroup}

 
 
 
