On Fri, 4 Dec 1998, Peter Rottengatter wrote:
>On Thu, 3 Dec 1998, Ronald Andersson wrote:
>
----- snip ----- re: "Freud's phenomenon"
>
>We use that term when someone messes something up in a way that clearly
>reveals what the person was thinking (instead of thinking how to spell
>right, here). It happens to everybody, and it's fun to realize when it
>happened. ;-)

Yep.  I'm naturally familiar with this sort of phenomenon too, since it
does happen to everybody.  I just wasn't familiar with the expression.


----- snip ----- re: TCP speed problems ('Nagle' code ?)
>
>I'm not as sure as you are about the connection to the Nagle code,
>although it looks indeed likely. We'll see soon.

I hope so.  This problem has been with us for too long already...

Btw:
This subject reminds me of some RFCs I found a while ago and intended to
mention on the list as worthy of study in relation to TCP efficiency.  They
were written in late 1996 through early 1997, which makes them a lot more
relevant to modern network environments than the original TCP RFCs.

The ones I found are:
-----
RFC 2001 on:    TCP Slow Start, Congestion Avoidance,
                Fast Retransmit, and Fast Recovery Algorithms
Category: Standards Track                                   January 1997
-----
RFC 2018 on:    TCP Selective Acknowledgment Options
Category: Standards Track                                   October 1996
-----
RFC 2140 on:    TCP Control Block Interdependence
Category: Informational                                     April 1997
-----

These all describe new methods to raise TCP efficiency beyond what the
original specification allows in a normal modern network.  Implementing
them is more work than our release schedule permits, so we should not aim
to incorporate them into TCP.STX at once.  But they are interesting
reading, and eventually we may want to add these features to TCP.STX,
once our present problems have been solved.


>> 3:  The TCP_ack_wait bug.  This needs further testing and verification,
>>     though it does appear to fail for most servers.  This means that it
>>     either fails to test the variables that keep track of outgoing data
>>     properly, or that those variables themselves are incorrectly handled,
>>     so that data can appear to be ACKed when that is not true.
>
>I believe the last alternative isn't really one. It had repeatedly caused
>corrupted and/or missing data during a connection, which we did not
>observe up to now ?

I agree, but the possibility had to be mentioned.  Actually we have managed
throughout the existence of both STiK and STinG without suffering much from
some bugs which have been present all along (our 'legacy' from STiK).  It
is only now that we are frequently using servers under STinG that those
bugs are becoming an obvious nuisance.  They were mostly hidden previously
by the fact that traditional applications for STiK always worked as pure
clients.  The only real exception was FTP clients when uploading, so it
is not surprising that this is one of the things STiK/STinG users have
always had some problems with.  Here too these bugs were revealed, although
never completely investigated.


>> 4:  The SYN bug.  Apparently sending of data by a server too early after
>>     an incoming call, so that the outgoing data is sent in the same packet
>>     as the SYN flag, can cause the first byte of that packet to be lost.
>
>Huh ? Have I missed something ? Haven't heard of this one before.

If I remove the initial delay between detecting connection to a server and
the sending of that server's initial response, then this bug will strike
in very nearly 100% of all tests.  I made some tests on this today...

LogSTinG inspection of transmitted packets shows a twofold error:

1:  The first byte of the data passed to TCP_send will not be present in
    the transmitted packet data block, which instead starts with the second
    byte of the original data.

2:  The transmitted packet data block will contain two garbage bytes at the
    end (after the last byte passed to TCP_send).  This means that the data
    length is incorrect.  One garbage byte would be expected if the correct
    length was used (even with an incorrect starting address), but two of
    them means that the length used was one more than the length of the
    data passed to TCP_send.

This was observed in packets sent with ACK and PUSH as the only ones of the
6 'special' bitflags in the TCP header to be set, which surprised me, since
it means that the packet did NOT carry the SYN bit.

The SYN bit could cause miscalculation of data position as well as of data
length, since it takes one byte position in the logical sequence, but not
in the physical storage.  But in this case the SYN bit was not part of the
packet after all, which raises the question:

Does some part of the TCP code mistakenly react as if SYN was present even
for some packets that do not have it...?  This could explain the bug.

(Here follow some rather 'heavy' speculations...)

Also, is it possible that the special treatment of SYN for the preceding
packet was missed due to 'early' TCP_send, so that the succeeding packet
was sent with misaligned connection variables (send.count, send.ptr, etc) ?

If so, it could mean we have a 'hole' in the state machine, caused by the
timeslice-delayed response to operations that theoretically should cause
immediate state changes.  An interrupt driven server (via TIMER_call) can
then send data while the connection is in an intermediate, undetermined
state...

(The speculations below are based on the older TCP sources familiar to me.)

Eg:  When an incoming server call is detected by the 'TCP_handler' function,
     it will cause the connection data to be completed with the correct
     local IP, remote IP, and remote port.  These changes will then become
     detectable to a server through the CIB pointer, but the state of the
     connection remains TLISTEN at exit from 'TCP_handler', and will remain
     so until the next time that 'timer_function' is activated.  That will
     call 'do_arrive' (via 'timer_work') for the datagram placed in the
     'pending' queue of the connection earlier by 'TCP_handler'.  And then
     'do_arrive' will call 'send_sync' and switch the connection state to
     TSYN_RECV.  This may appear quite correct and even failsafe...

     BUT:

     'timer_function' is a 'TIMER_call' handler, and that is also what we
     must expect any interrupt driven server routine to be.  That is
     certainly the case for NetD, for example.  The running order of the
     'TIMER_call' handlers is not determinable, so neither a module nor
     a server can know in which order they will be run.

     This means that there is a potential 'hole' where a server may react
     to a completed connection, even though the connection state value for
     it is still TLISTEN, which the server can not check.  Any API calls
     made for that connection will then be dealt with as if made in a
     normal TLISTEN state, meaning that they will not function correctly.

     However, NetD does not make any TCP_send calls from its TIMER_call
     handler. Instead it simply moves a ptr to the NetD_CON struct of the
     connection to an 'activation' queue belonging to the NetD server that
     opened this NetD_CON for its services.  This is to avoid wasting too
     much time in interrupt driven code.

     Later, when the NetD APP/ACC calls NetD_exec_APP in NetD STX, this
     will lead to calls to the 'connect' and 'traffic' functions of the
     NetD server that owns the newly completed connection, and that is
     where the TCP_send (etc) operations will be performed.  This means
     that they are executed as part of a normal GEM APP (the NetD APP).

     That means that TCP_send at this point in time should have worked
     correctly, unless we somehow got here by an abnormal task switch.

NB:  Because of the last sentence above, I find it highly likely that this
     speculation does not show the cause of this particular problem, but
     it does indicate a new kind of problem to look out for due to the
     ability of STinG servers and modules to use TIMER_call handlers.
     This will allow them to be activated in the interval between the
     'TCP_handler' operations and the 'timer_function' operations that
     those cause.  The solution is to not allow TCP_handler to leave any
     connection it handles in a connection state that does not match the
     current situation at the time when any TIMER_call handler is called.

     The best, and most generic, way to eliminate all such holes with a
     simple modification is to change the existing IP_handler function
     so that it takes another argument in addition to the main handler
     ptr.  That new argument should be a ptr to a handler identical to
     the kind used with TIMER_call (or NULL if none is needed).  The new
     handler should be added to a new internal queue similar to the one
     used for TIMER_call handlers.  Then modify the 'poll_ports'
     function to add one more 'for' loop that calls these new timer
     driven, protocol specific handlers.  This new loop should naturally
     be added directly after the loop in which 'process_dgram' is
     called, since that is what leads to the calling of the protocol
     handlers.

     This change would make it impossible for any TIMER_call handlers
     to 'see' intermediate states between execution of the main handler
     of any installed protocol and the execution of its supporting timer
     driven handler.

     Apart from the kernel changes described above it will only require
     two small changes each in UDP and TCP modules.  You will need to
     add the 'timer_function' pointer argument to the IP_handler calls,
     and remove the calls to TIMER_call.


----- snip ----- re: my need for newer TCP sources
>
>I will do so. People have reported that the speed problem does not strike
>with a certain TCP version as of last year, while the following version
>causes trouble. I have partially isolated the versions and their
>corresponding source code, so that by comparing the source we should be
>able to deduce something.

Mmm, I'm not so sure of that.  I do not think that the problem was ever
completely gone from any version, though some versions had characteristics
that masked the bugs under common operating conditions.  We need to look
for some fundamental changes of method in how ACK packets are optimized
for efficiency.  I do have some ideas, which I will try out as soon as I
have a modern TCP source to experiment with.


----- snip ----- re: rearranging timeout loops to allow minimal delay
>
>I indeed believe I have changed it already, but I'll double-check. I'm now
>not as tight on time anymore, so I can commit more time to hacking.
>However there are several things to do that have been delayed due to my
>examn stuff, so still a few more days before devoting most of my free time
>to programming and bug-hunting.

I had expected as much.  Putting normal life 'on hold' for studies usually
does require some catching up afterwards.
