
                                Algorithm Corner
                                ================
                                 By Simon Rigby

     This month I will  be  looking  (briefly)  at  two  questions posed in
     previous issues of ICTARI.  The  first  concerns  displaying Star Wars
     type text. The second concerns file compression.

     Any members with general questions of  the  form 'How Do I ...' please
     let me know and  I'll  try  my  best.  I  cannot  guarantee a flawless
     method, but I'm sure I can  point  you  in the right direction or give
     you some idea of the possibilities.

                                  3D Type Text
                                  ------------

     The first point to note  is  that  you  will  need  to design your own
     letters using vector points; bitmaps would  need to be scaled which is
     beyond the scope of this section. The points for all the letters would
     need to be arranged as if they were arranged normally down the screen.
     Basically, this section will look at  changing points from flat screen
     to simulated perspective.

     What is perspective?

     Imagine being in a  cinema  and  looking  at  the  screen. This is our
     screen as we see it. Now imagine  looking  up;  you will see a beam of
     light getting smaller as it  heads  towards  the projector. If we move
     the screen nearer  to  the  projector,  the  screen  image will become
     smaller.

                                    /|
                                   / |
                                  /  |
                                 /   |
                                /|   |
                               / |   |<-big image
                     projector \ |<--|--small image
                                \|   |
                                 \   |
                                  \  |
                                   \ |
                                    \|
                             <--------Z axis

     It is fairly straightforward to see that as  Z is increased, the X & Y
     points will head towards  the  middle  of  the  screen,  until (at the
     projector) any point will be a dot at the middle of the screen.

     If the centre of the screen is at X,Y and a point is at x,y then

     x'=((x-X)/z)+X
     y'=((y-Y)/z)+Y providing Z>0

     So, if we feed  our  x,y  points  into  a  routine that performs these
     calculations on them, we will get a  new  set of x',y' points out that
     scale according to the z value given.

     But does this help us?

     We haven't got a z to work with  and  all this does is make our screen
     points gradually smaller as we increase z.  Well, now look at the line
     from the bottom of the screen to  the  projector. If we put our points
     on this plane, the text will move away and up as z is increased.

             -----
               |                 ---
               |       ---        |
               |        |         |
             -----     ---     -------
             Normal   Scaled   Along Z

     So, if we take our points  as  being  x,z points instead of x,y points
     and say that the y value will be  at  the bottom of the screen for all
     points:

             x'=((x-X)/z)+X
             y'=((y-Y)/z)+Y but y=bottom-of-screen

     so for low rez, (320 by 200)
             x'=((x-160)/z)+160
             y'=((199-100)/z)+100 but our y is z in this equation

     so...
             x'=((x-160)/y)+160
             y'=(99/y)+100

     If we feed all our points  into  this  equation  we  will get a set of
     points to draw that looks as if it is disappearing into the distance.

     Finally...
     If we have an offset counter that  starts well negative (say -200) and
     add one to it before adding  it  to  our starting y values before each
     calculation, then each draw will show  the objects going up the screen
     (remembering to clear the last draw before the next draw).

     The only thing to remember is that z must not be zero so any y+offset
     value at zero must be made into a  1. Setting a clip rectangle for the
     screen will mean that all points can be passed and only those in range
     will be drawn.

     I hope this makes sense  to  someone  -  try  playing with simple line
     shapes and let us know how you get on.


                          Compression (Huffman Coding)
                          ----------------------------

     Imagine we have a (very) small file  containing the letters A,B,C & D.
     Imagine that the file is ABADACAA.

     Now, we can store this as  eight  bytes  using the ASCII codes for the
     letters, but this is very inefficient. We  can see that there are only
     four different letters and  so  could  use  2  bits  to store the four
     possibilities of letter. This would mean  storing a table at the start
     of the file containing each code and  the letter it represents. So, we
     can have the letters represented by a binary tree as follows:

                    *
                 1 / \0     So, A=11, B=10, C=01, D=00
                  /   \
                 *     *
               1/ \0 1/ \0
               A   B C   D

     This would mean storing the tree instead  of a table, which would make
     the compressed file bigger than necessary, but bear with me...

     Each tree entry  point would have  two  values;  a pointer to the left
     tree entry and a pointer to the right tree entry. If there was no left
     (or right) entry, the  entry  would  contain  the  letter instead of a
     pointer (one bit of the entry being reserved for whether the entry was
     a letter or a pointer).

     Now look at this tree...

               *
              / \
             A   *
                / \
               B   *
                  / \
                 C   D

     This is the worst case scenario  for  a  tree  where there is one less
     node than there are letters. But is  this the worst case for us? Let's
     look again ...

               *
             1/ \0       A=1, B=01, C=001, D=000
             A   *
               1/ \0
               B   *
                 1/ \0
                 C   D

     Now, with 2 bits per letter and eight letters, there are 16 bits used.
     With the situation above we have...

             5 x A = 5 bits
             1 x B = 2 bits
             1 x C = 3 bits
             1 x D = 3 bits
                    -------
                    13 bits a saving of 3 bits - why?

     It's all down to  probability;  if  a  letter  occurs  more often than
     another it would be profitable to use less bits for that letter at the
     expense of less used letters.

     The procedure is as follows:-

     1] Make a table of all letters (bytes) and how often they occur.
     2] Take the two letters (or pointers  or one of each) that occur least
        and add the number of times they occur together.
     3] Put the two letters (or pointers) with the smallest occurrence into
        a tree.
     4] Remove the two least occurring letters from the table.
     5] Put the tree pointer into the  table  with an occurrence of the two
        letter's occurrences added together.
     6] Repeat from step 2 until no entries left in table.

     This is the optimum tree for this distribution.

                   *       Look familiar?
                  / \
                 *   A
                / \
               *   B
              / \
             C   D

     Now go through the file again and  walk  the tree to find each letter,
     adding a 1 bit to the output stream for  a left branch and a 0 bit for
     a right branch. At the end of  the  file,  write the last few bits and
     pad out with zero's to the end of the byte.

     To unpack the file again,  simply  load  in  the tree structure and go
     down the tree taking a left for a 1  bit and a right for a 0 bit until
     each letter is found.

     Note that the length of the original file  needs to be known to stop a
     spurious character possibly occurring at the  end  of the file, due to
     the padded bits.

     Improvements:
       Note that the  compression  factor  of  Huffman  coding  is improved
     whenever there are a lot more of one value than another (as above) but
     that it will never be  worse  than  the  original scheme (ignoring the
     overhead of storing the tree structure and file length.

     Therefore -

     Samples can be improved by delta coding first, then Huffman coding (as
     above). Delta coding  means  taking  the  difference  between adjacent
     values and then storing this value  instead of the true value. Because
     samples tend to increase and  decrease  slowly the differences tend to
     stay small.

             eg. 25 30 40 45 47 50 46 40 37 30 24
             is  25 05 10 05 02 03 -4 -6 -3 -7 -6

     You can see that over the  length  of  the sample, the values will all
     become concentrated around small positive and negative values, perfect
     for Huffman coding.

     True colour pictures that  are  based  on  real (or ray-traced) images
     would have slowly changing RGB values. If all the values of red (green
     and  blue)  are  separately  delta   coded  and  then  Huffman  coded,
     substantial compression can be achieved.

     Most formats can be delta (or run length) encoded, which will probably
     help the subsequent compression.

                         -----------------------------

     That's all for this month, next month is (hopefully) up to you...

                     Simon Rigby.
                   a.k.a. PoshPaws
