I've written an FTP server for MS-Windows (works on Win3.1 ... NT), named
Serv-U, currently used by several thousand people. In general, all
works well, except for a few people with a problem that only happens on NT.
Normally I'd assume some peculiar software-software interaction, but in
this case I've seen it happen twice on my own FTP site (about 500 clients
a day). I've used the occasions to log everything, down to the socket
function call level, in an attempt to get some idea what's going on. The
picture is clear, but the cause is not.
What seems to happen is that the listening socket (on port 21), which is
an asynchronous socket set up to send notification messages on connect,
stops sending messages. Incoming clients still connect, because the stack
handles that by itself, but the connection is never passed on to the FTP
server. All other sockets (even those from the same task) seem to work
normally meanwhile.
I run a watchdog timer, which once every 3 minutes makes a 'getsockopt()'
call to see if the listening socket is still listening, and the
return value indicates that as far as the stack is concerned all is well.
I also let it post a fake FD_ACCEPT message (i.e. once every 3 minutes), and
when the problem is occurring it invariably finds the connecting client and
handles it as it should, which shows that the rest of the program and the
message window itself are still functioning normally. The watchdog also
re-registers the listening socket for FD_ACCEPT notification through a
'WSAAsyncSelect()' call. However, once the socket stops responding it will
not go back to sending out messages on connections, despite the async
select call.
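To make the watchdog's first check concrete, here is a minimal sketch of that kind of test. It is not the actual Serv-U source: it assumes the option queried is SO_ACCEPTCONN, and the names is_listening(), sock_t and optlen_t are made up for this post. Only the headers differ between Winsock and BSD sockets.

```c
#include <assert.h>

#ifdef _WIN32
#include <winsock2.h>
typedef SOCKET sock_t;
typedef int optlen_t;
#else
#include <sys/socket.h>
typedef int sock_t;
typedef socklen_t optlen_t;
#endif

/* Hypothetical helper: returns 1 if the stack reports the socket as
 * listening, 0 if it does not, and -1 if getsockopt() itself fails. */
int is_listening(sock_t s)
{
    int flag = 0;
    optlen_t len = sizeof(flag);

    /* SO_ACCEPTCONN is the assumed option; it is set once listen()
     * has been called on the socket. */
    if (getsockopt(s, SOL_SOCKET, SO_ACCEPTCONN, (char *)&flag, &len) != 0)
        return -1;

    /* Debug-build sanity check: the option should come back as one int. */
    assert(len == (optlen_t)sizeof(flag));
    return flag != 0;
}
```

A result of 1 only says the stack still considers the socket to be listening. As described above, that stays true even while the notifications have stopped.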
The program is also linked with Borland's CodeGuard, which would report
problems like dangling pointers or other messed up pointers. It reports
nothing, so I'm fairly sure the program internals are OK and it does not
get into some unknown state because of bugs.
That's about the whole story. I'm puzzled by what's happening. There does
not seem to be anything wrong, yet NT messes up the listening socket
somehow. One problem is that I can't detect this from within my program,
so I can't just junk the socket and get a new one.
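The closest thing to a detection I can think of is the check the fake FD_ACCEPT already implies: ask the stack directly whether a client is waiting on the listening socket. Below is a rough sketch using select(). The names connection_pending() and wait_ms are hypothetical, and I have not verified this against the NT problem. A waiting client with no FD_ACCEPT could also just be a notification still in flight, so this is a hint, not proof.

```c
#include <assert.h>
#include <stddef.h>

#ifdef _WIN32
#include <winsock2.h>
typedef SOCKET sock_t;
#else
#include <sys/select.h>
#include <sys/socket.h>
typedef int sock_t;
#endif

/* Hypothetical helper: returns 1 if at least one connection is waiting
 * to be accepted on the listening socket, 0 if none shows up within
 * wait_ms milliseconds, and -1 if select() fails. */
int connection_pending(sock_t s, long wait_ms)
{
    fd_set readable;
    struct timeval tv;
    int n;

    assert(wait_ms >= 0);   /* a negative wait makes no sense here */

    FD_ZERO(&readable);
    FD_SET(s, &readable);
    tv.tv_sec = wait_ms / 1000;
    tv.tv_usec = (wait_ms % 1000) * 1000;

    /* A listening socket is reported as readable when accept() would
     * succeed without blocking. Winsock ignores the first argument. */
    n = select((int)s + 1, &readable, NULL, NULL, &tv);
    if (n < 0)
        return -1;
    return n > 0 && FD_ISSET(s, &readable);
}
```

A watchdog could call connection_pending(sock, 0) on each tick. If it keeps returning 1 while no real FD_ACCEPT has arrived since the previous tick, that would be the moment to close the socket and create a new one.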
I'm hoping this rings some bells as to 'known' problems in NT. I'm about
at the end of my rope as far as debugging goes, and all I'm finding is
that all is well within Serv-U, yet the socket goes off the deep end.
The above happens on NT 3.51 SP4. It has never been observed on Win95,
where the same 32-bit program seems to work fine.
Well, hope the above made some sense, and you have more of a clue as to
what's going on than I have.
Thanks for your help!
Regards,
Rob
-/-
well. I'll try to keep up with reading this newsgroup, but E-mail is
faster.
--------- "Save a plant, eat a vegetarian..." (Rajesh '95) -----------
Rob Beckers is the author of "Serv-U", FTP server for Win3.1, WFW3.11,
Win95 and NT. There are currently well over 2300 registered users, not
counting licenses for multiple copies. You can find more information about
Serv-U at http://CatSoft.dorm.duke.edu
-----------------------------------------------------------------------