I'm extending a data acquisition app, written in C. It will acquire a
large amount of data (e.g. from a PCI card on a Sparc Ultra5 running
Solaris 9, I think), perhaps 10 GiB or more. So we're into
largefiles. The acquiring process as it stands now is a 32-bit app,
although I _suppose_ that could change if that's required for the
solution. The acquired data need to go two places, sort of like
"tee":
1. The data need to be written to a local disk.
2. The data also need to be sent to another process, running on
another machine. The data could be sent to the other process as its
standard input (popen("rsh othermachine otherprocess", "w"), or
somesuch, I guess). Or they could be written to a named pipe, I
suppose. (Do named
pipes work if either the reader or writer is looking at an NFS
filesystem?) It could be some other way of feeding a remote process
that you thought of that I didn't.
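To be concrete, the inner loop I'm imagining for the "tee" part looks
roughly like this (all names are mine; write_all is just a helper that
covers partial writes, which pipes in particular produce routinely):

```c
#include <unistd.h>
#include <errno.h>

/* Write all n bytes to fd, retrying on partial writes and EINTR.
   Returns 0 on success, -1 on a real error. */
int write_all(int fd, const char *buf, size_t n)
{
    while (n > 0) {
        ssize_t put = write(fd, buf, n);
        if (put < 0) {
            if (errno == EINTR)
                continue;      /* interrupted; just retry */
            return -1;         /* real error */
        }
        buf += put;
        n   -= (size_t)put;
    }
    return 0;
}

/* ... then in the acquisition loop, each block goes both places:

       if (write_all(disk_fd, block, len) < 0 ||
           write_all(pipe_fd, block, len) < 0)
           ... bail out ...

   Of course this is exactly the naive version that blocks the
   acquirer when the pipe's consumer is slow, which is the problem. */
```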
The problem is that the data must be collected in a timely fashion,
but the remote consumer might be slow. (On the other hand, the remote
consumer might be fast enough to keep up and indeed be mostly
waiting.) And the amount of data is large, in excess of the RAM in
the machine, but not more than the local disk. So we don't want the
acquiring thread/process to block because it has filled up all
buffers. And the buffers perhaps can't simply be made larger (or can
they?)
Is a reasonable solution to have multiple pthreads within the "tee"
process, one writing the data to disk, and the other reading the data
back from the disk and sending it to the remote consumer? These two
threads would communicate (synchronize) so that the one reading from
disk never tried to get ahead of the one writing to disk. I don't
want the slow one busy-looping, polling for more data (maybe it's not
that slow); I'd rather it block and be woken when more data are
available.
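In case it helps to see what I mean, here's a rough sketch of the
synchronization I have in mind, using a mutex and condition variable so
the sender blocks rather than spins (all names are made up;
demo_writer just stands in for the real disk-writing thread):

```c
#include <pthread.h>
#include <stddef.h>

/* Shared state: how far the disk writer has gotten, and whether
   acquisition is finished. */
struct progress {
    pthread_mutex_t lock;
    pthread_cond_t  more;     /* signalled when 'written' advances */
    long long       written;  /* bytes safely on local disk */
    int             done;     /* writer has finished and closed */
};

/* Sender side: block until the writer is ahead of us.  Returns the
   new high-water mark, or -1 once the writer is done and we have
   caught up. */
long long wait_for_data(struct progress *p, long long sent)
{
    pthread_mutex_lock(&p->lock);
    while (p->written == sent && !p->done)
        pthread_cond_wait(&p->more, &p->lock);
    long long w = p->written;
    int done = p->done;
    pthread_mutex_unlock(&p->lock);
    return (done && w == sent) ? -1 : w;
}

/* Writer side: call after each successful write() to the local file. */
void announce_progress(struct progress *p, long long new_total, int done)
{
    pthread_mutex_lock(&p->lock);
    p->written = new_total;
    p->done = done;
    pthread_cond_signal(&p->more);
    pthread_mutex_unlock(&p->lock);
}

/* Illustration only: a stand-in for the disk-writer thread. */
void *demo_writer(void *arg)
{
    struct progress *p = arg;
    for (int i = 1; i <= 5; i++)
        announce_progress(p, i * 100LL, i == 5);
    return NULL;
}
```

The sender would then loop on wait_for_data(), reading the file from
its own offset up to the returned mark and shipping that to the remote
consumer, so it never reads past what's actually on disk.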
If I instead went with the sort of obvious solution of NFS mounting
the local filesystem onto the remote machine, and having the slow
reader, running on the remote machine, open the file itself, is there
any way for that remote process to wait for more data to be written to
the file? (It will get EOF when more data might still be coming,
right?) Is there any way for it to know when
the writer of that file has called close()?
- Ben Chase