Mail To: Peter Rottengatter <stik-beta-testers@list.zetnet.co.uk>
Subject: Unblocking STinG API methods
----------------------------------------------------------------------------

NB: The subject used to be "Re: GlueSTiK, FTP..." etc

Hi Peter and Vassilis (and anyone else interested in this),

It's just me, 'butting in' again to add a few comments and ideas.
This time I'll cut a lot of stuff without comment, because I feel
those things have been dealt with in other mail already.

I'm also glad to note that the 'temperature' of the discussion seems
to be getting back under control again, though it was still rising in
the mail I'm now replying to.

Spirited discussion is all very well, and very necessary, but we need
to control our tempers.  Especially as nearly all things that give rise
to 'flames' are usually based on innocent misunderstandings.


On Mon, 23 Nov 1998, Peter Rottengatter wrote:
>On Sun, 22 Nov 1998, Vassilis Papathanassiou wrote:
>
----- snip ----- re: the /dev/sting proposal

I agree with Peter that we do not currently need a new 'ownership' system
based on a more 'MagiC-like' pid.  I also agree with Vassilis that such a
system could be useful in creating a more 'MagiC-like' environment under
non-MagiC systems, but I do not think it can be done well for a reasonable
investment of work.

For the problems we do have now, that Vassilis wanted the new pid stuff for,
I believe that the alternative means I described in other mail will suffice.
(This refers to my suggestions for non-blocking TCP_close and Resolve calls)


----- snip ----- re: div stuff

----- snip ----- re: connection 'ownership' issues
>
Peter wrote:
>
>Still, I haven't said that I would ignore ownership issues once and
>forever. And I haven't said either that I intent to deny access on the
>base of ownership, which you seem to imply further down in your post.

I'd like to add that if ownership were determined by the pid of the active
APP, and used to deny access, then implementations like NetD would be
impossible.  Even BNET would have problems using connections that were
opened while one APP was active, once taskswitching has passed control
to some other APP.

In general I think such denial would make many types of 'smart' servers
impossible, including all those that would access connections through API
calls made from interrupt driven code.


----- snip ----- re: div stuff

----- snip ----- re: pid methods for non-blocking API modifications
>
Peter wrote:
>
>So you think it won't work without knowledge of act_pd ? I think that's
>the second thing to think about, first we need to discuss a generic way
>of signaling the application. Having done that, we might miraculously
>find we do not need the act_pd at all (which is why I called it inde-
>pendent).

I think you are wrong in the first statement. Thinking of whether it would
work without 'act_pd' info has led me to a solution for how to signal APPs
about state changes for a connection, which does not require any 'act_pd'
or other method of direct ownership knowledge.  So I do think you are
correct in your third statement after all.

This method is an extension of the ideas used for my previous suggestions
of non-blocking modifications for Resolve and TCP_close, and would also
replace that earlier suggestion for TCP_close.  The basic idea is that
each call to UDP_open or TCP_open, or to new similar functions, should have
a new argument in addition to the already defined ones.  The new argument
should be a pointer to a long variable owned by the calling 'opener'.

Note that we do not have to care about who or what that owner is, nor how
the ownership arose.  All we need to know is that the variable *is* owned,
through irrelevant means, by some code that wants a connection opened.

This variable's address should then be stored, by the UDP and TCP modules
(since this method can apply to both UDP and TCP) in a new structure element
of their connection structures.  Its meaning to all connection-oriented
API calls would be that the connection should be handled in a fully non-
blocking manner.

I will call the variable 'CONN_result' below, to have a brief way of
referring to it. So where I say things like 'store E_ERROR in CONN_result'
I will actually mean that the place where E_ERROR is stored is found by
fetching the pointer from the new connection structure element.

Connections opened in the old ways should have the pointer to CONN_result
set to NULL by UDP_open or TCP_open, so that all API functions can know
that for these connections the old blocking methods must be used.
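
To make the idea concrete, here is a minimal sketch in C.  Everything
in it is hypothetical: 'DemoConn', 'demo_open' and the other names are
stand-ins invented for illustration, not the real STinG structures or
the real UDP_open/TCP_open signatures.  It only shows the pointer being
stored in the connection record, with NULL marking an old-style
(blocking) connection.

```c
#include <stddef.h>

/* Hypothetical, trimmed-down connection record.  The real STinG
 * connection structures contain many more fields. */
typedef struct {
    int   in_use;
    long *conn_result;      /* NULL = old blocking behaviour */
} DemoConn;

#define DEMO_MAX_CONN 4
static DemoConn demo_conns[DEMO_MAX_CONN];

/* Stand-in for an extended open call.  Returns a handle >= 0,
 * or -1 (placeholder error value) if no slot is free. */
int demo_open(long *conn_result)
{
    int h;
    for (h = 0; h < DEMO_MAX_CONN; h++) {
        if (!demo_conns[h].in_use) {
            demo_conns[h].in_use = 1;
            demo_conns[h].conn_result = conn_result;
            return h;
        }
    }
    return -1;
}

/* Old-style open: identical, but the pointer is forced to NULL. */
int demo_open_blocking(void)
{
    return demo_open(NULL);
}

/* Every connection-oriented API function would make this test
 * before deciding whether it is allowed to block. */
int demo_is_nonblocking(int handle)
{
    return demo_conns[handle].conn_result != NULL;
}
```

The point of the sketch is that no pid or other ownership information
appears anywhere; the stored pointer is all the API needs.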

In unblocked mode, whenever any connection-oriented API call cannot
return immediately with a final result, it should instead return with a
new error code, E_UNFINISHED, which is reserved for connections opened
in the new unblocking mode, and will thus never be used with old clients
that open connections in the old way. The same code should also be stored
in CONN_result, and that should then remain unaltered until an interrupt
driven routine completes the 'unfinished' function, and stores the final
result code in CONN_result.  For the case of UDP_close or TCP_close, the
routine will thereafter release all the data structures of the connection.
They are no longer needed, because the 'opener' routine knows the variable
it used to open the connection, and will be looking there for the result.

But now you are probably asking how that interrupt routine knew what work
was needed ?  (I would !)

The answer is not very difficult.  Each connection structure needs another
new element, which in fact should be a substructure declared as the 'union'
of the structs implied by the argument declarations of all the connection-
oriented API functions plus one uint16 used for the API opcode and another
uint16 that may be used to represent various phases of progress for an op.
The 'unfinished' API functions should copy their arguments into this struct
and link the connection into a new special queue for connections that need
interrupt driven completion of API ops. Thereafter the API function should
return to the caller with the E_UNFINISHED result as described above.
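
A rough C sketch of such a substructure follows.  The names, the
opcode list, the argument layouts and the numeric value of
E_UNFINISHED are all assumptions made up for this example; only the
general shape (union of argument structs, opcode, phase, queue link)
comes from the description above.

```c
#include <stddef.h>
#include <stdint.h>

#define E_UNFINISHED (-100)  /* placeholder, not an assigned STinG code */

enum { OP_NONE, OP_SEND, OP_CLOSE };    /* hypothetical opcodes */

/* One member per API function, mirroring its argument list. */
typedef union {
    struct { const void *buf; int16_t len; } send;
    struct { int16_t timeout; }              close;
} DemoArgs;

typedef struct DemoConn {
    long            *conn_result;   /* caller-owned result variable */
    uint16_t         opcode;        /* which API op is pending */
    uint16_t         phase;         /* progress within that op */
    DemoArgs         args;          /* saved copy of the arguments */
    struct DemoConn *next_pending;  /* link in the unfinished-op queue */
} DemoConn;

static DemoConn *pending_queue = NULL;

/* Save the op in the connection, queue it, report E_UNFINISHED.
 * A real implementation would have to protect the queue update
 * against the timer interrupt; that is left out here. */
static long defer_op(DemoConn *c, uint16_t opcode, const DemoArgs *args)
{
    c->opcode = opcode;
    c->phase  = 0;
    c->args   = *args;
    c->next_pending = pending_queue;
    pending_queue   = c;
    *c->conn_result = E_UNFINISHED;
    return E_UNFINISHED;
}

/* Non-blocking close; in this sketch it is always deferred. */
long demo_close(DemoConn *c, int16_t timeout)
{
    DemoArgs a;
    a.close.timeout = timeout;
    return defer_op(c, OP_CLOSE, &a);
}
```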

Each protocol handler (TCP and UDP) already has a timer interrupt driven
routine that handles various reception and transmission tasks by checking
connection structures linked to various queues.  Their work should now be
extended by also checking the queue for connections with unfinished API
ops, as described above, and calling internal completion functions that
should be defined for each such API function.  Some of these may need
to have their work done in several 'phases', which is why I mentioned
above that an extra uint16 should complement the API opcode to indicate
completion phase progress.

Whenever a completion function is called, but still can't complete its op
immediately, it should return E_UNFINISHED to the caller (timer handler),
possibly after updating the phase progress value (if used).  The timer
handler will check the return value, and if it is E_UNFINISHED will simply
proceed to process the next entry in the queue, until the end of the queue
is reached.

When a completion call does manage to complete its op however, then it
should return with the true return value of the function.  The timer
handler will then copy that value to CONN_result, unlink the connection
from the queue, and then proceed as above for the rest of the queue.
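
The queue pass described in the last three paragraphs could look
roughly like this.  Again the names and error values are invented for
the example, and 'demo_complete' is a dummy completion function that
simply pretends its op needs three timer ticks.

```c
#include <stddef.h>

#define E_UNFINISHED (-100)   /* placeholder value */
#define E_NORMAL     0        /* placeholder for "no error" */

typedef struct DemoConn {
    long            *conn_result;
    unsigned short   phase;
    struct DemoConn *next_pending;
} DemoConn;

static DemoConn *pending_queue = NULL;

/* Dummy completion function: finishes on the third call. */
static long demo_complete(DemoConn *c)
{
    if (++c->phase < 3)
        return E_UNFINISHED;
    return E_NORMAL;
}

/* Called from the protocol's timer routine: one pass over the queue. */
void demo_timer_pass(void)
{
    DemoConn **link = &pending_queue;

    while (*link != NULL) {
        DemoConn *c  = *link;
        long      rc = demo_complete(c);

        if (rc == E_UNFINISHED) {
            link = &c->next_pending;     /* stays queued, move on */
        } else {
            *c->conn_result = rc;        /* publish the final result */
            *link = c->next_pending;     /* unlink from the queue */
            c->next_pending = NULL;
        }
    }
}
```

Using a pointer to the link field means an entry can be unlinked
without keeping a separate 'previous' pointer during the scan.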

How to use the API with above alterations from a multi-connective APP:
----------------------------------------------------------------------

Suppose we want up to 16 simultaneous connections in an APP. We can
then define two arrays, "int handles[16];" and "long results[16];"
used to store handles and API result codes for each connection.
These are then used in various ways as described in sections below.

Multi-connective opening:
-------------------------
I'd use an opening subfunction that searches the 'handles' array for a free
entry, and uses the index of that entry to access both arrays.  It then
attempts to open the connection, using "&results[index]" for the CONN_result
pointer argument in the UDP_open or TCP_open call used.  If there is any
error returned from this, the subfunction immediately returns that error
code to the caller, but otherwise it stores the handle returned in the
array element "handles[index]", and then returns the index (not the handle)
to the caller.  All access functions inside the APP should use that index
to refer to the connection, since it gives access both to the handle and
the CONN_result variable, in the arrays.
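
A sketch of such an opening subfunction is given below.  'demo_open'
is only a stand-in for the extended UDP_open/TCP_open call and hands
out fake handles; NO_HANDLE and E_NOSLOT are values I made up for the
example, and I assume that valid handles are never negative.

```c
#define MAX_CONN  16
#define NO_HANDLE (-1)    /* marks a free slot */
#define E_NOSLOT  (-200)  /* hypothetical APP-level error: table full */

static int  handles[MAX_CONN];
static long results[MAX_CONN];

/* Stand-in for the extended open call; returns fake handles. */
static int fake_next_handle = 100;
static int demo_open(long *conn_result)
{
    *conn_result = 0;
    return fake_next_handle++;
}

/* Must be called once before any app_open(). */
void app_init(void)
{
    int i;
    for (i = 0; i < MAX_CONN; i++)
        handles[i] = NO_HANDLE;
}

/* Returns an index into handles[]/results[], or a negative error. */
int app_open(void)
{
    int i, h;

    for (i = 0; i < MAX_CONN; i++) {
        if (handles[i] == NO_HANDLE) {
            h = demo_open(&results[i]);
            if (h < 0)
                return h;       /* pass the API error straight back */
            handles[i] = h;
            return i;           /* the index, not the handle */
        }
    }
    return E_NOSLOT;
}
```

Since indexes are never negative and error codes always are, the
caller can tell the two apart with a single sign test.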

Multi-connective application loops:
-----------------------------------
Using the schemes described in this mail it should be obvious that APPs
can never be blocked by any STinG operations, and are therefore free to
update screen, service menus, etc. at all times (even in singletasking),
PROVIDED that they are coded in such a way as to 'remember' what ongoing
STinG transfer tasks are in progress, so they scan the CONN_result values
for completion info in all application loops, and remember how to proceed
when specific completion values arrive.

NB:  This need for careful client coding is in no way particular to this
     implementation or to STinG.  It is in fact a prime requirement for
     any unblocking communication program, and especially multi-connective
     ones, regardless of the 'signaling' methods.

Much of that 'remembering' can be implemented with the aid of arrays similar
to 'handles[16]' and 'results[16]' as defined above, but defined to contain
'state' constants defined by the APP to represent the various activities
which the connections associated with the index values are engaged in.
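
As an illustration, one pass of such a scan might look like the sketch
below.  The state names, the value of E_UNFINISHED, and the rule that
a negative final result means failure are assumptions of the example.
The results array is declared 'volatile' because it would be written
from interrupt code.

```c
#define MAX_CONN     16
#define NO_HANDLE    (-1)    /* marks a free slot */
#define E_UNFINISHED (-100)  /* placeholder value */

/* States defined by the APP itself, one per connection. */
enum AppState { ST_IDLE, ST_SENDING, ST_DONE, ST_FAILED };

static int           handles[MAX_CONN];
static volatile long results[MAX_CONN];
static enum AppState states[MAX_CONN];

/* Call once per pass of the main application loop.  Returns the
 * number of connections whose state changed in this pass. */
int app_poll(void)
{
    int i, changed = 0;

    for (i = 0; i < MAX_CONN; i++) {
        if (handles[i] == NO_HANDLE)
            continue;                   /* slot not in use */
        if (states[i] != ST_SENDING)
            continue;                   /* nothing pending here */
        if (results[i] == E_UNFINISHED)
            continue;                   /* still in progress */

        states[i] = (results[i] < 0) ? ST_FAILED : ST_DONE;
        changed++;
    }
    return changed;
}
```

The loop never waits for anything, so the APP can call it between
screen updates and menu handling without risk of stalling.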

Many other ways are possible, using queues and special structs instead of
arrays, and of course the choice of which to use is up to each author.

I suggest that all traffic handling functions that scan the arrays, queues,
or whatever, should always start with the one corresponding to 'handles[16]'
above, so as to avoid wasting time on entries that don't represent
currently open connections.  One way to do that automatically is to use
an array of structs instead, with each struct starting with a 'handle'
element.  That element would then be the first thing checked when scanning
an array entry to see if it represents a connection needing more work.

Multi-connective closing:
-------------------------
This differs from other non-opening multi-connective operations only in that
when the final result code arrives in CONN_result of a connection, then the
entry in the 'handles[16]' array should be erased, and the index should no
longer be considered to refer to an open connection.
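
In code, the extra step for closing might be sketched as follows.  The
'closing' flag array, NO_HANDLE and the E_UNFINISHED value are made up
for the example; the sketch assumes the APP set closing[index] when it
issued the close call.

```c
#define MAX_CONN     16
#define NO_HANDLE    (-1)    /* marks a free slot */
#define E_UNFINISHED (-100)  /* placeholder value */

static int           handles[MAX_CONN];
static volatile long results[MAX_CONN];
static int           closing[MAX_CONN];  /* nonzero: close pending */

/* Frees every slot whose pending close has delivered its final
 * result code.  Returns the number of slots freed. */
int app_reap_closed(void)
{
    int i, freed = 0;

    for (i = 0; i < MAX_CONN; i++) {
        if (handles[i] != NO_HANDLE
            && closing[i]
            && results[i] != E_UNFINISHED) {
            handles[i] = NO_HANDLE;      /* index no longer valid */
            closing[i] = 0;
            freed++;
        }
    }
    return freed;
}
```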

More alternatives:
------------------
Like I said above, many alternatives to these methods exist, and one very
good one is to define an array of structs and 2 queue root pointers to such
structs.  One entry of each struct should be a link ptr to the next struct
in the queue, another entry should be the connection handle, and another
should be the CONN_result variable. At init one of the queue roots should
be initialized to point to the base of the array, and then each struct in
the array should be linked to the next one, except for the last which should
have a NULL link.  The second queue root should be initialized to NULL.

The first root could be called 'free_CN_queue' and the second 'used_CN_queue'
and other queues can also be meaningfully used.  The APP can easily remember
ongoing traffic simply by moving entries between different queues and also
(for some cases) by modifying extra elements of these structs.
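
A minimal sketch of the two-queue setup, in C.  The queue root names
are the ones suggested above; the struct name 'CN', the pool name and
the helper functions are invented for the example.

```c
#include <stddef.h>

#define MAX_CONN 16

typedef struct CN {
    struct CN *next;     /* link to the next struct in the queue */
    int        handle;   /* connection handle */
    long       result;   /* the CONN_result variable */
} CN;

static CN  cn_pool[MAX_CONN];
static CN *free_CN_queue;
static CN *used_CN_queue;

/* Link the whole array into the free queue; used queue starts empty. */
void cn_init(void)
{
    int i;
    for (i = 0; i < MAX_CONN - 1; i++)
        cn_pool[i].next = &cn_pool[i + 1];
    cn_pool[MAX_CONN - 1].next = NULL;
    free_CN_queue = &cn_pool[0];
    used_CN_queue = NULL;
}

/* Move the first free entry to the used queue; NULL if none left. */
CN *cn_alloc(void)
{
    CN *cn = free_CN_queue;
    if (cn == NULL)
        return NULL;
    free_CN_queue = cn->next;
    cn->next      = used_CN_queue;
    used_CN_queue = cn;
    return cn;
}

/* Move an entry from the used queue back to the free queue. */
void cn_release(CN *cn)
{
    CN **link = &used_CN_queue;

    while (*link != NULL && *link != cn)
        link = &(*link)->next;
    if (*link == NULL)
        return;                 /* entry was not in the used queue */
    *link         = cn->next;
    cn->next      = free_CN_queue;
    free_CN_queue = cn;
}
```

An APP would pass "&cn->result" as the CONN_result pointer when
opening, so the struct itself carries everything needed to follow the
connection from queue to queue.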

I describe this method last only because I know that some consider these
methods complex and hard to understand, whereas I consider them to be the
most efficient and programmatically 'elegant'.


----- snip ----- re: no comment
>
>I agree that nobody likes monolithic servers. Not "even" Ronald I'm sure,
>as you imply here. Apart from this, Ronald's "love" for single-TOS only
>stems from a realistic estimation on how far and widespread single-TOS
>still is in use.

Actually there is one other factor.  SingleTOS is the original after which
all other TOS-like systems are partially modeled, so what works there will
usually work in any of the others that has achieved decent compatibility.

This makes singleTOS an invaluable testing environment, especially as many
debugging tools work much better there, giving features not available in
any other environment.  (eg: safe AMON debugging of ACCs etc)


>I love MagiC too, but I won't approve either anything
>that would limit STinG's usage to MagiC.

Good.  We should always choose compatibility rather than dependency.


>> But proposing something that will allways have a wall in front of it called
>> 'SingleTOS' will not lead us anywhere.

I do not believe in 'impenetrable' walls unless I've butted my head against
them and noted that they really are (even if that does hurt at times...  ;-)

As you will note in this mail, and some other recent ones I've posted,
I've managed to butt quite a few holes into those 'walls' already...
It remains to be seen whether these ideas will be adopted for STinG,
but I do not think that anyone can refute that they open possibilities
for singleTOS-compatible changes of the kind we all want.
