To: stik@ON-Luebeck.DE
Subject: Re: [3] STIK: Work on STiK


On Sat, 28 Dec 1996 02:04:27 Martin Mitchell <mem@mode wrote: 
>
>I know you can use the latter 128 bytes in the basepage, but it is quite a
>small amount to use for some special purpose, and I think it is unusual to
>use it. Hence I did not include this possibility in my minimum overhead
>calculation.

Agreed, but when we are talking about the need to size-optimize code,
then such 'unusual' methods should definitely be included.  Actually
it is not really all that unusual, since the normal Pexec modes set
this area up as a 'default' DTA area for use with Fsfirst & Fsnext.


>Actually I didn't miss them, I just chose not to include them. You might
>note I labelled my table "_minimum_ overhead.." So the actual cost would still be
>higher than 256 bytes or 0 bytes per module. True, I agree you can ignore
>the Mshrink in modules if they Ptermres almost immediately. I agree also
>that half of each basepage may be used, and the amount is practically
>negligible, however I think efficient programming is desirable for
>machines with only 1Mb, for example.

Agreed, but there are two important things to consider here.  First, 128
bytes per module is not all that much.  Second, it is incorrect to think
of those 128 bytes as 'lost for no purpose'.  Their purpose is important.
The investment of these 128 bytes gains us the support of the Pexec loader
and relocator, and also ensures that each module starts as a proper gemdos
program that can legally access all gemdos functions.
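
To make the numbers concrete, here is a rough sketch in C of the basepage
layout as I remember it from the usual TOS documentation.  The field names
follow that documentation, 'uint32_t' stands in for the 32-bit pointers of
the 68000, and 'basepage_overhead' is only a hypothetical helper for the
arithmetic, not anything that exists in STiK.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* GEMDOS basepage; on the 68000 every pointer is 32 bits wide, so
 * uint32_t is used to keep the sizes right on any host compiler. */
typedef struct {
    uint32_t p_lowtpa;       /* start of the TPA (the basepage itself) */
    uint32_t p_hitpa;        /* first byte after the TPA               */
    uint32_t p_tbase;        /* text segment start                     */
    uint32_t p_tlen;         /* text segment length                    */
    uint32_t p_dbase;        /* data segment start                     */
    uint32_t p_dlen;         /* data segment length                    */
    uint32_t p_bbase;        /* BSS start                              */
    uint32_t p_blen;         /* BSS length                             */
    uint32_t p_dta;          /* default DTA, normally at p_cmdlin      */
    uint32_t p_parent;       /* parent's basepage                      */
    uint32_t p_reserved;
    uint32_t p_env;          /* environment string                     */
    uint8_t  p_undef[80];
    uint8_t  p_cmdlin[128];  /* command line, doubles as default DTA   */
} Basepage;

static_assert(sizeof(Basepage) == 256, "basepage must be 256 bytes");
static_assert(offsetof(Basepage, p_cmdlin) == 128, "second half at 128");

/* Basepage bytes that remain pure overhead for a number of modules.
 * reuse_cmdlin != 0 assumes each module puts its last 128 bytes to use. */
unsigned long basepage_overhead(unsigned modules, int reuse_cmdlin)
{
    size_t per_module = reuse_cmdlin ? offsetof(Basepage, p_cmdlin)
                                     : sizeof(Basepage);
    return (unsigned long)modules * per_module;
}
```

With four modules this gives 1024 bytes if the second half is written off,
and 512 bytes if it is put to use.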


>I shouldn't really bother to care, actually. Both my main machines have at
>least 8Mb RAM. However I do have a 1Mb STFM too, and I would like to think
>that STiK is not so bloated that it can run usefully on the STFM.

I assume you intended to say "can't run" in the sentence above.

I definitely agree on this, but I don't think the Pexec method of module
loading will lead to that situation.  It is in this that we differ.


>Since my STFM is TOS 1.00, I also need to consider all the patch programs
>necessary to make it usable.

Not really...  What you need is a new ROM/EPROM set.

TOS 2.06 is naturally the best, but is not compatible with some old programs,
and unfortunately requires an extra board for installation in an STFM.

TOS 1.04 is the one which introduced most of the important modernizations
of GEMDOS, and is compatible with all software except programs that relied
on some undocumented (usually bugged) feature of older TOS.

Some prefer to use double-sized EPROMs allowing you to switch between TOS
1.04 and TOS 2.06, but in practice this is seldom used.  Once someone has
gotten used to the convenience of TOS 2.06, he won't voluntarily use 1.04.

If you don't want to invest in any new boards for the STFM, you should at
least change to TOS 1.04, which only requires replacement of the TOS chips.


>> The actual time taken is mainly dependent on the file and folder structure
>> leading to the 'STIK_DIR', and the number of files in that folder.
>> Assuming that all modules have the 'fastload' bit set there should not be
>> a very large difference between reading 'data' modules or using Pexec.
>> I've tried his demo code with a lot more modules than 4, and the load time
>> was quite tolerable.
>
>That sounds fine, however what if system memory is cleared for Pexec()? It
>might slow things down considerably, since TOS 1.00 doesn't use fastload.

That is one of the (very very many) serious bugs in that TOS, for which there
is a patch utility available.  This is named "PINHEDxx.PRG", where "xx" is
replaced by the version code (I believe the latest was "18" == version 1.8).
So this issue is not our problem.


>> Yes I do, and have myself argued similarly with Peter.  I do not really
>> think that this issue matters however, and using some non-program format
>> would have several drawbacks.  eg: We would have to do a lot more work on
>> the dev-kits to ensure that all compilers can produce the special format.
>
>The dev-kits? Well, this is the most compelling reason so far for your
>Pexec() method. However I would say:
>
>- It is not really difficult to link custom startup code in any decent C
>compiler.

It should not be, but creating such code requires knowledge both of that
particular compiler and of what the new code must contain.  The
latter means that one of the STiK programmers must do it, and this can
only be done when one of us is familiar with the requirements of the
compiler in question.


>- You have just reminded me.. using the standard C startup code from any C
>compiler will probably be at least 256 bytes per module! So that takes the
>minimum space estimate per module to 512 bytes/module! Potentially
>having 2kb overhead for 4 modules is far too much I think.

True, but Peter (and I) intend the main modules to be in optimized assembler,
and the minimum-RAM users are not likely to be using any others.  Besides,
your argument is equally valid against all C programming for these machines.
Of course this does not mean that C programming should be forbidden, but it
does mean that where space and speed are critical we should encourage the use
of human-optimized assembler.  No compiler can ever beat that.

Even so, to enable more programmers to contribute to STiK, we must and will
also allow modules to be developed in C, or any other high level language.
And this is where the Pexec method really shines, since all languages that
produce normal Atari executables are already compatible with it.


>> Also, we would lose the use of 'FastRAM' and MiNT 'protection' flags in
>> the program headers.
>
>This is no loss. STiK should be set for TT-RAM and any DMA buffers can be
>Mxalloced.
>With a custom module system, all the STiK modules are part of STiK and will
>inherit the flags of STiK.

Precisely, which means that they must all use identical flags.
If that had always been ideal, the flags would not exist.
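
As an illustration of what each module keeps by staying a normal program
file, here is a small C sketch that reads the flags long out of a 28-byte
program header.  The bit values are the ones I know from the TOS/MiNT
documentation (please verify them against your own references), and the
names 'prg_flags' and 'be32' are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    PF_FASTLOAD  = 0x01,  /* clear only the BSS, not all free memory  */
    PF_TTRAMLOAD = 0x02,  /* program may be loaded into TT-RAM        */
    PF_TTRAMMEM  = 0x04,  /* Malloc requests may be served by TT-RAM  */
    PF_PROT_MASK = 0xF0   /* MiNT memory protection mode              */
};

enum {
    PRG_HEADER_SIZE  = 28,
    PRG_MAGIC        = 0x601A,  /* first word of every program file   */
    PRG_FLAGS_OFFSET = 22       /* after magic, 4 lengths, 1 reserved */
};

/* Program files are big-endian, so assemble the long byte by byte. */
static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Stores the flags long in *flags and returns 1, or returns 0 when
 * the buffer does not start with the program file magic. */
int prg_flags(const uint8_t hdr[PRG_HEADER_SIZE], uint32_t *flags)
{
    unsigned magic = ((unsigned)hdr[0] << 8) | hdr[1];

    assert(flags != NULL);
    if (magic != PRG_MAGIC)
        return 0;
    *flags = be32(hdr + PRG_FLAGS_OFFSET);
    return 1;
}
```

With Pexec each module file carries its own copy of this long, so one
module can ask for TT-RAM while another stays in ST-RAM.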


>> I agree that he overreacted to your single reference to OOP, but please note
>> that although C coded modules are allowed, no part of the kernel itself will
>> be entrusted to the care of any C compiler.  This program is being written
>> completely in human-optimized assembler for maximum speed optimization.
>
>This is good news. And having a custom module system and IP as a module
>should simplify such human optimization, due to closer integration and
>better interfaces between modules and the kernel. Also it keeps the size
>of the code segments at a more practical level for hand optimization.

I do not really see how having IP as a module can give closer integration
than when it is directly integrated as part of the same assembly program.
Any interface possible for a custom module system can also be implemented
for Pexec modules, only the initialization needs to be different.

The point about code size is of course valid, but hardly crucial.  I am quite
certain that we can handle it with IP included, which will also allow some
optimizations not possible if IP must be loaded through a general module
loader (regardless of custom/Pexec methods used).


>> >That's right, but it's equally valid for me to say: 'Why should IP be
>> >included along with all the miscellaneous functions?'
>> 
>> Because they include an interrupt driven threader which may interact with
>> some IP routines quite a lot.  If IP was written as a separate C module,
>> that would require a much larger function call overhead, than when it is
>> done in optimized assembler integrated with the calling (and called)
>> routines.
>
>I think the time critical parts of IP should be hand optimized too.
>Presumably both Peter and yourself are concerned with the time taken to
>receive data/route/defragment/depack IP... however as you shall be calling
>routines even in a STiK kernel IP, can I convince you that using a module
>really doesn't change the timing, at the assembler level?
>
>eg:	jsr	route
>
>can still be the same even with a module system. The module just has a
>tiny relocation/patch routine to change the addresses for such time
>critical subroutines to functions within the module. Parameters
>are still passed the same way. (in registers I hope!) Hence there is
>no real loss of speed here.

In principle you are correct, this could be done, but it would also mean
that this particular module is not a general module.  STiK would have to
set up tables of pointers exclusively for the use of this one module.
It, similarly, would also have to pass a unique set of pointers to STiK
in a manner and for purposes quite different from any other module.

In short, its interface would not follow the modular concept of STXs,
and therefore it should not be an STX module.
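
The patching idea can be modelled roughly like this in C.  All names here
are hypothetical, and in assembler the patch routine would rewrite the
operand of the 'jsr' itself so that the call stays direct; the function
pointer below only models that one slot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int (*route_fn)(uint32_t dest_ip);

/* Kernel-side placeholder, used until a module has patched the slot. */
static int route_unavailable(uint32_t dest_ip)
{
    (void)dest_ip;
    return -1;
}

/* The single slot the kernel calls through for routing. */
static route_fn kernel_route = route_unavailable;

/* What the module's tiny patch routine would do at load time. */
void ip_module_install(route_fn impl)
{
    assert(impl != NULL);
    kernel_route = impl;
}

/* Stand-in for the module's router: port 0 for 127.x.x.x, else port 1. */
int module_route(uint32_t dest_ip)
{
    return (dest_ip >> 24) == 127 ? 0 : 1;
}

/* Kernel code calls exactly as before; only the target has changed. */
int kernel_send(uint32_t dest_ip)
{
    return kernel_route(dest_ip);
}
```

Note that 'kernel_route' and 'ip_module_install' exist for this one module
only, which is precisely the kind of private interface I mean.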


>> But making them separate from each other, each network-specific protocol
>> integrated with its own interrupt routines and module handler, will make
>> both networks faster.  Especially in the normal situation, when only one
>> network type will be installed.
>
>While I accept most of what you say about STiK and possible implementation
>of other network types, I don't really think it impacts on the speed as
>much as you might think. See above!

Given that particular method of implementation you are correct.  It would
be possible to produce code this way with almost the same speed as with the
fully integrated approach.  But the code of that module would be much more
complex (and RAM-consuming) than a corresponding integrated version.

Also, there are some speed-optimizations possible _only_ with full integration,
such as using macros instead of function calls, which reduces overhead to zero.
(To be used carefully of course, in time-critical parts, since it wastes RAM.)
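
The same trade-off can be shown in C, though the real thing would be
assembler macros.  The routine is only an example I picked (folding a
32-bit sum to 16 bits, as an IP checksum needs), not code from STiK.

```c
#include <assert.h>
#include <stdint.h>

/* Function form: one copy of the code, one call and return per use. */
uint16_t fold_sum(uint32_t sum)
{
    sum = (sum & 0xFFFFu) + (sum >> 16);   /* add the carries back in  */
    sum = (sum & 0xFFFFu) + (sum >> 16);   /* the first add may carry  */
    return (uint16_t)sum;
}

/* Macro form: expanded in place at every use, so there is no call
 * overhead, but each use costs code space.  The argument is evaluated
 * several times, so it must not have side effects. */
#define FOLD_SUM(s) \
    ((uint16_t)(((((s) & 0xFFFFu) + ((s) >> 16)) & 0xFFFFu) + \
                ((((s) & 0xFFFFu) + ((s) >> 16)) >> 16)))
```

Both forms give the same result; the macro just trades RAM for the time
otherwise spent on the call and return.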


>> For some cases specialization is a good thing, just as generalization is
>> for other cases.  It is necessary to use a well balanced compromise between
>> these two principles to achieve the best results. Both I and Peter believe
>> that separating IP from the STiK kernel means severe speed penalties, though
>> we also agree that the other stuff gains usefulness by its modularity.
>
>I agree with your first comment. It seems that you and Peter are
>unconvinced about the speed of calling module functions. If done as I have
>suggested above, I don't believe this is too much of a problem.

I do realize that, since I have used similar methods myself when combining
programs that _had_ to be separate, and yet must have shared routines/data.
But such an interface for IP is unique, containing stuff not used/useable by
any other STX module.  So it is not really suitable to be an STX module.

If there were some other (forcing) reason to have IP as a separate module,
then I would recommend using such methods as you describe to make a unique
kind of IP module (non-STX), but I do not know any such reason.
Thus, I still recommend the fully integrated approach.


>If either you or Peter would like to give some real examples of how
>'severe speed penalties' will be incurred, I would be glad.

Peter is doing the actual IP coding, so he'd better answer that, but be ready
for an answer that assumes a "standard" STX module.  If special assumptions
and exceptions must be made for an IP module, it is not correctly modular.


>Calling such functions in a module system should not be very different
>from calling them within a kernel, from a speed perspective.

That is true, but only if the method you describe is used, which in turn
means that IP is still not a "standard" STX module, but will have to have
some special code in STiK to load and initialize it properly.

-------------------------------------------------------------------------
Regards:  Ronald Andersson                     mailto:dlanor@oden.se
                                               http://www.oden.se/~dlanor
-------------------------------------------------------------------------
