Zero-copy TCP


Post by T. Srikant » Sat, 13 Jun 1998 04:00:00



What exactly is the mechanism involved with zero-copy TCP in Solaris?
I understand it is clever use of some hardware features to avoid copying
between buffers. Can anyone that knows better explain the mechanism?
Which version of Solaris supports the zero-copy TCP?

- Srikanth

 
 
 

Zero-copy TCP

Post by Marc Slemk » Sun, 14 Jun 1998 04:00:00



>What exactly is the mechanism involved with zero-copy TCP in Solaris?
>I understand it is clever use of some hardware features to avoid copying
>between buffers. Can anyone that knows better explain the mechanism?
>Which version of Solaris supports the zero-copy TCP?

2.6, but only with a limited number of NICs like one of Sun's ATM
cards (OC-12 card I think).

It also supports hardware checksums, which are arguably more useful
and arguably dangerous.

Part of this support has to be support for gathering writes, and
I think the hardware supports receive scatter along with that.

I never was able to get any technical info on what is required to be
sure the kernel will use it.  From what I was told by someone who
checked with someone at Sun, Apache 1.3 should do things that benefit
from it a bit.

If you mmap() a file, have buffers aligned on 16k boundaries, and
do a write() of 16k (or possibly some multiple; writev() with one
16k chunk will _not_ do it for that chunk, unfortunately) you should
end up not needing to have the kernel copy the data from the buffer
cache into an mbuf and then to the NIC.
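
A minimal sketch of the call pattern described above, written in Python
for brevity: map the file, then issue write() calls of 16 KiB from offsets
that are multiples of 16 KiB. The names write_mapped_file and CHUNK are
illustrative and not from the post. This only shows the shape of the
calls; whether the kernel actually skips the copy depends on the OS
release and the NIC, which the code cannot check. Note also that mmap()
only guarantees page alignment, so on a system with pages smaller than
16k the mapping address itself may not sit on a 16k boundary.

```python
import mmap
import os

CHUNK = 16 * 1024  # 16 KiB, the write size mentioned in the post


def write_mapped_file(path, out_fd, chunk=CHUNK):
    """Write a file to out_fd from an mmap()ed region, one chunk per write().

    Returns the number of bytes written. out_fd would normally be a
    connected TCP socket descriptor; any writable descriptor works here.
    """
    total = 0
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        if size == 0:
            return 0  # mmap() cannot map an empty file
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm, \
                memoryview(mm) as view:
            for off in range(0, size, chunk):
                # Slice the mapping without copying it into a new buffer.
                with view[off:off + chunk] as piece:
                    sent = 0
                    while sent < len(piece):
                        # write() may accept fewer bytes than requested.
                        sent += os.write(out_fd, piece[sent:])
                    total += sent
    return total
```

With a connected socket `s`, this would be called as
`write_mapped_file("/path/to/file", s.fileno())`. The last chunk is
shorter than 16 KiB unless the file size is an exact multiple.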

 
 
 

Zero-copy TCP

Post by Alan Coopersmi » Sun, 14 Jun 1998 04:00:00



>What exactly is the mechanism involved with zero-copy TCP in Solaris?
>I understand it is clever use of some hardware features to avoid copying
>between buffers. Can anyone that knows better explain the mechanism?

The original research report is available at
        http://www.sunlabs.com/techrep/1995/abstract-39.html
(but I don't know how different the version that made it into
 Solaris 2.6 is)

--
________________________________________________________________________

Univ. of California at Berkeley         http://soar.Berkeley.EDU/~alanc/

 
 
 

Zero-copy TCP

Post by Marc Slemk » Mon, 15 Jun 1998 04:00:00




>>What exactly is the mechanism involved with zero-copy TCP in Solaris?
>>I understand it is clever use of some hardware features to avoid copying
>>between buffers. Can anyone that knows better explain the mechanism?
>The original research report is available at
>    http://www.sunlabs.com/techrep/1995/abstract-39.html
>(but I don't know how different the version that made it into
> Solaris 2.6 is)

That paper does _not_ describe the mechanism implemented in 2.6.
I'm not sure I would call that "the original research report", but
rather simply some prior work from Sun on the general concept.
 
 
 

Zero-copy TCP

Post by T. Srikant » Tue, 16 Jun 1998 04:00:00



> It also supports hardware checksums, which are arguably more useful
> and arguably dangerous.

I think data corruption over the I/O bus will not be caught, if the checksum
calculation is in interface hardware. But, data transfers for disk over the bus
are assumed to be correct and not checked in software. So it is not unreasonable
to have checksums in hardware.

- Srikanth

 
 
 

Zero-copy TCP

Post by Logan Sh » Tue, 16 Jun 1998 04:00:00




>I think data corruption over the I/O bus will not be caught, if the checksum
>calculation is in interface hardware. But, data transfers for disk over the bus
>are assumed to be correct and not checked in software.

...as I learned because the bus on my Amiga 2000 had a bug, and large
DMA transfers to the hard disk sometimes tended to have a byte deleted
at one point, a number (typically 50000 to 100000) of the following
bytes shifted ahead by one byte, and a garbage byte inserted to make up
the difference.

I solved the problem by telling the filesystem not to do DMA transfers
over a certain size, so that the problem (probably) wouldn't occur.

Now that I think about it, I recognize the potential for generating
truly random numbers.  Maybe I should have written a device driver for
my serendipitous random number generator and sold the thing to somebody
looking to generate unguessable keys for use in cryptography...

  - Logan

 
 
 

Zero-copy TCP

Post by Marc Slemk » Fri, 19 Jun 1998 04:00:00




>> It also supports hardware checksums, which are arguably more useful
>> and arguably dangerous.

>I think data corruption over the I/O bus will not be caught, if the checksum
>calculation is in interface hardware. But, data transfers for disk over the bus
>are assumed to be correct and not checked in software. So it is not unreasonable
>to have checksums in hardware.

Yes and no.  It is true that you can introduce corruption between
the CPU or memory and the NIC.  It is true that you can have corruption
over some busses anyway, so you can argue that you aren't introducing
the chance of undetected corruption, just adding another place where
it is possible.  Well, TCP's checksum isn't that strong to begin with,
but..
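
To illustrate why the TCP checksum is considered weak, here is a sketch
of the 16-bit ones' complement Internet checksum (as described in
RFC 1071). The function name internet_checksum is illustrative. A real
TCP checksum also covers a pseudo-header, which is left out here; the
point is only that the sum does not depend on the order of the 16-bit
words, so some kinds of corruption pass unnoticed.

```python
def internet_checksum(data: bytes) -> int:
    """Return the 16-bit ones' complement checksum of data."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # big-endian 16-bit word
    while total >> 16:
        # Fold any carry bits back into the low 16 bits.
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


original = b"\x12\x34\x56\x78"
reordered = b"\x56\x78\x12\x34"  # same 16-bit words, swapped
same = internet_checksum(original) == internet_checksum(reordered)
```

Here `same` is True: swapping two 16-bit words changes the data but not
the checksum.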

However, not all transfers over a bus are necessarily equal in the
chances of corruption.  It has been argued that there are more serious
issues with offloading checksum generation to hardware.

It is also an obviously desirable feature in terms of performance.

I am not qualified to judge the validity of the concerns about it,
but people who know more than I have thought some of them to be valid.
Others have also dismissed them.

So I guess it is just a warning to understand the risks.

 
 
 

1. Possible problem with zero-copy TCP and sendfile()

        Hello,

        I have discovered a possible problem on my host. The short
story is: When downloading ISO images from this host (which
runs 2.4.3 + zerocopy and ProFTPd with sendfile()), the image is
sometimes corrupted (MD5 checksum of the downloaded file does not match).

        The long story: My server is an Athlon 850 on an ASUS A7V with 256M RAM.
Seven IDE discs, one SCSI disc. The controllers and NIC are as follows
(output of lspci):

00:04.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 10)
00:0a.0 SCSI storage controller: Adaptec AIC-7881U
00:0c.0 Ethernet controller: 3Com Corporation 3c905C-TX [Fast Etherlink] (rev 74)
00:11.0 Unknown mass storage controller: Promise Technology, Inc.: Unknown device 0d30 (rev 02)

        The server runs Linux 2.4.3 with zero-copy patches and ProFTPd
1.2.2rc1 compiled with --enable-sendfile.
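
For readers unfamiliar with the call, this is roughly what a
sendfile()-based transfer looks like, sketched in Python via
os.sendfile(). The helper name send_whole_file is illustrative; this is
not ProFTPd's code, only the general pattern of handing the kernel a
file descriptor and a socket descriptor so that the data does not pass
through a user-space buffer.

```python
import os


def send_whole_file(sock, path):
    """Send the file at path over a connected socket using sendfile().

    Returns the number of bytes the kernel reported as sent.
    """
    sent = 0
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        while sent < size:
            # sendfile() may transfer fewer bytes than requested.
            n = os.sendfile(sock.fileno(), f.fileno(), sent, size - sent)
            if n == 0:
                break  # file shrank or peer stopped accepting data
            sent += n
    return sent
```

The loop matters: a single sendfile() call is allowed to send only part
of the requested range.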

        The FTP area is on a RAID-1 volume, which is created over two LVM
partitions (each LV spans three physical disks). I hope RAID-1 can speed
things up for multiple simultaneous users.

        Yesterday Red Hat Linux 7.1 was released, and since then
the server has had about 220 anonymous FTP users and has been pushing data
at almost full 100 Mbps Ethernet speed (currently the 2-hour average is
89.7 Mbps according to MRTG). Today I've got about three complaints
about corrupted ISO images. When I run md5sum on the server itself,
the MD5 checksums, of course, perfectly match. I've tried to download
the files from another machine on the same net, and MD5 sums were correct.
However, I have one report of corrupted download even from the same physical
network.
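
The check described above (comparing the MD5 of a downloaded image with
the MD5 computed on the server) can be reproduced without md5sum. A
small sketch, with the helper name md5_of_file and the block size chosen
for illustration:

```python
import hashlib


def md5_of_file(path, block_size=1 << 20):
    """Return the hex MD5 digest of a file, read in fixed-size blocks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read in blocks so a large ISO image need not fit in memory.
        for block in iter(lambda: f.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()
```

Running it on the server copy and on the downloaded copy and comparing
the two strings gives the same answer as comparing md5sum output.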

        In the last 24 hours the server pushed out about 660 gigabytes
of Red Hat 7.1. Is this amount (i.e. three reports out of 660 gigabytes)
a serious problem?

        Also note that I have no corrupted download report for rsync.
But I think rsyncd does not use sendfile(), and of course the vast majority
of people use FTP, not rsync, for downloading.

-Yenya

--
\ Jan "Yenya" Kasprzak <kas at fi.muni.cz>       http://www.fi.muni.cz/~kas/
\\ PGP: finger kas at aisa.fi.muni.cz   0D99A7FB206605D7 8B35FCDE05B18A5E //
\\\             Czech Linux Homepage:  http://www.linux.cz/              ///
Mantra: "everything is a stream of bytes". Repeat until enlightened. --Linus
