Post Reply 
Optimized Stack Operations
02-25-2019, 06:35 AM
Post: #1
Optimized Stack Operations
Hello all!

I'm developing a scientific RPN calculator with OLED Display, ATTINY85, 16 keys and a CR2032 battery - and I'm almost done.
As (flash) memory is (always) short (8.192 bytes) I have to implement a lot of functions by using the calculator itself (and not some bloated math library).
This can be done by calling basic subroutines (one byte commands) for stack operations.

For example to calculate the cosine of x (cos(x)) I have to use the (intrinsic) sine function and calculating sqrt(1-sin(x)*sin(x)) with stack operations - and use these steps:
Code:
SIN ENTER * CHS 1 + SQRT

Many other functions are implemented like this:
Code:

      COS (Cosine)
        COMMANDS (5): SIN ENTER * CHS 1 + SQRT

        Converts    4 mem  -    to:     -
        stack[]     ---------         -----
        (incl.      3  t   -            -        cos(x) = sqrt(1 - sin(x)*sin(x))
        mem)        2  z   -            -
        from:       1  y   -            -
                    0  x   x          cos(x)


      TAN (Tangent)
        COMMANDS (9): SIN ENTER ENTER * CHS 1 + SQRT /

        Converts    4 mem  -    to:     -
        stack[]     ---------         -----               sin(x)          sin(x)
        (incl.      3  t   -            -        tan(x) = ------ = -----------------------
        mem)        2  z   -            -                 cos(x)   sqrt(1 - sin(x)*sin(x))
        from:       1  y   -            -
                    0  x   x          tan(x)


      ACOS (Arcus Cosine)
        COMMANDS (4): ASIN CHS 90 +

        Converts    4 mem  -    to:     -
        stack[]     ---------         ------
        (incl.      3  t   -            -        acos(x) = 90 - asin(x)
        mem)        2  z   -            -
        from:       1  y   -            -
                    0  x   x          acos(x)


      ATAN (Arcus Tangent)
        COMMANDS (10): ENTER ENTER ENTER * 1 + SQRT INV * ASIN

        Converts    4 mem  -    to:     -
        stack[]     ---------         ------                           x
        (incl.      3  t   -            -        atan(x) = asin( ------------- )
        mem)        2  z   -            -                        sqrt(1 + x*x)
        from:       1  y   -            -
                    0  x   x          atan(x)


      PV (Present Value)
        COMMANDS (14): CHS SWAP ENTER 1 + SWAP ROT SWAP PWR CHS 1 + ROTup /

        Converts    4 mem  -    to:     -
        stack[]     ---------         -------              1 - (1+i)^-n
        (incl.      3  t   -            -        PV(i,n) = ------------
        mem)        2  z   -            -                       i
        from:       1  y   i            -
                    0  x   n          PV(i,n)


      GAMMA (due to formula of Nemes)
        COMMANDS (46): 1 + ENTER ENTER INV EXP ENTER INV CHS + 2 / * SWAP 6 PWR 810 * INV
                       + LN * 2 / SWAP CHS + ROT ROT LN SWAP .5 - * + .9189385 + EXP

        Converts    4 mem  -    to:    -
        stack[]     ---------         ---                 ln(2PI)                       x
        (incl.      3  t   -           -         ln(x!) = ------- + (x-1/2)*ln(x) - x + - * ln(x*sinh(1/x) + 1/(810*x^6))
        mem)        2  z   -           -                     2                          2
        from:       1  y   -           -
                    0  x   x           x!


      SINH (Hyperbolic Sine)
        COMMANDS (9): ENTER ENTER EXP SWAP CHS EXP - 2 /

        Converts    4 mem  -    to:      -
        stack[]     ---------         -------              exp(x) - exp(-x)
        (incl.      3  t   -             -       sinh(x) = ----------------
        mem)        2  z   -             -                        2
        from:       1  y   -             -
                    0  x   x          sinh(x)


      COSH (Hyperbolic Cosine)
        COMMANDS (9): ENTER ENTER EXP SWAP CHS EXP + 2 /

        Converts    4 mem  -    to:      -
        stack[]     ---------         -------              exp(x) + exp(-x)
        (incl.      3  t   -             -       cosh(x) = ----------------
        mem)        2  z   -             -                        2
        from:       1  y   -             -
                    0  x   x          cosh(x)


      TANH (Hyperbolic Tangent)
        COMMANDS (12): 2 * CHS EXP ENTER CHS 1 + SWAP 1 + /

        Converts    4 mem  -    to:      -
        stack[]     ---------         -------              exp(x) - exp(-x)
        (incl.      3  t   -             -       tanh(x) = ----------------
        mem)        2  z   -             -                 exp(x) + exp(-x)
        from:       1  y   -             -
                    0  x   x          tanh(x)


      ASINH (Area Hyperbolic Sine)
        COMMANDS (8): ENTER ENTER * 1 + SQRT + LN

        Converts    4 mem  -    to:      -
        stack[]     ---------         --------
        (incl.      3  t   -             -       asinh(x) = ln(x + sqrt(x*x + 1))
        mem)        2  z   -             -
        from:       1  y   -             -
                    0  x   x          asinh(x)


      ACOSH (Area Hyperbolic Cosine)
        COMMANDS (8): ENTER ENTER * 1 - SQRT + LN

        Converts    4 mem  -    to:      -
        stack[]     ---------         --------
        (incl.      3  t   -             -       acosh(x) = ln(x + sqrt(x*x - 1))
        mem)        2  z   -             -
        from:       1  y   -             -
                    0  x   x          acosh(x)


      ATANH (Area Hyperbolic Tangent)
        COMMANDS (11): ENTER ENTER 1 + SWAP CHS 1 + / SQRT LN

        Converts    4 mem  -    to:      -
        stack[]     ---------         --------                      1 + x
        (incl.      3  t   -             -       atanh(x) = ln(sqrt(-----))
        mem)        2  z   -             -                          1 - x
        from:       1  y   -             -
                    0  x   x          atanh(x)


      SUM (Prepare stack/arguments for summarizing)
        COMMANDS (12): 1 STO ROT ENTER ENTER * ROT * ROT SWAP SQRT SUM1

        Converts    4 mem  -    to:    1    and where
        stack[]     ---------         ---   SUM1 adds
        (incl.      3  t   -          x*y   stack[i]
        mem)        2  z   -          x*x   to
        from:       1  y   y           y    sum[i]
                    0  x   x           x


      ND (Normal Distribution: Probability Density and Cumulative Distribution Function)
        COMMANDS (38): ENTER ENTER ENTER * * .07 * CHS SWAP 1.6 * CHS + EXP 1 + INV
                       SWAP ENTER * CHS 2 / EXP 0.3989423 *

        Converts    4 mem  -    to:    -                  (x)
        stack[]     ---------         ---        CDF = integral(PDF) = 1/(1 + exp(-0.07*x^3 - 1.6*x))
        (incl.      3  t   -           -                 (-inf)
        mem)        2  z   -           -
        from:       1  y   -          CDF        PDF = 1/sqrt(2*PI) * exp(-x*x/2)
                    0  x   x          PDF


      R2P (Rectangular to Polar Coordinates)
        COMMANDS (14): ENTER * SWAP ENTER ENTER ROT * + SQRT ENTER ROT / ASIN ROTup

        Converts    4 mem  -    to:    -
        stack[]     ---------         ---      r = sqrt(x*x + y*y)
        (incl.      3  t   -           -
        mem)        2  z   -           -                 y
        from:       1  y   y           a       a = atan(---)
                    0  x   x           r                 x


      P2R (Polar to Rectangular Coordinates)
        COMMANDS (15): SWAP SIN ENTER ENTER * CHS 1 + SQRT ROT * ROT * SWAP ROT

        Converts    4 mem  -    to:    -
        stack[]     ---------         ---      y = r * sin(a)
        (incl.      3  t   -           -
        mem)        2  z   -           -
        from:       1  y   a           y       x = r * cos(a)
                    0  x   r           x


      STAT (Statistik, Mean Value, Standard Deviation)
        COMMANDS (16): SWAP ROT ENTER RCL / ENTER ROT * CHS + RCL 1 - / SQRT SWAP

        Converts    4 mem  n    to:    -                 XX - X^2 / n
        stack[]     ---------         ---      d = sqrt(--------------)
        (incl.      3  t   XY          -                     n - 1
        mem)        2  z   XX          -
        from:       1  y   Y           d       m = X / n
                    0  x   X           m


      QE (Quadratic Equation)
        COMMANDS (25): SWAP 2 / CHS ENTER ENTER * SWAP ROT SWAP - SQRT ROT ROT
                       ENTER ROTup ENTER ROT SWAP ROT - ROT + SWAP ROT

        Converts    4 mem  -    to:    -
        stack[]     ---------         ---      y = x*x + p *x + q
        (incl.      3  t   -           -
        mem)        2  z   -           -       x1 = -p/2 + sqrt((p/2)^2 - q)
        from:       1  y   p           x2
                    0  x   q           x1      x2 = -p/2 - sqrt((p/2)^2 - q)


      LR (Linear Regression)
        COMMANDS (33): SUM2STACK * SWAP RCL * ROTup RCL * ROTup SHADOWSAVE
                       SUM2STACK ENTER * SHADOWLOAD1
                       SWAP ROT - ROT - ROTup / ENTER ENTER ENTER SHADOWSAVE
                       SUM2STACK SHADOWLOAD2 ROTup * - RCL / SWAP

        Converts    4 mem  n    to:    -
        stack[]     ---------         ---      y = a * x + b
        (incl.      3  t   XY          -
        mem)        2  z   XX          -       a = (XY*n - X*Y) / (XX*n - X*X)
        from:       1  y   Y           b
                    0  x   X           a       b = (Y - X*a) / n

This already works but it is far from beeing optimized - because I'm not really a specialist in stack operations.
Unfortunately every operation costs 2 bytes of flash memory - and - slows down the code execution (running with a slow framerate (to save battery power) of 10 fps every calculation step needs 100 milliseconds).

So please help me to optimize those stack operations - every better sequence of commands or any idea is highly appreciated.

Thanks in advance.
deetee


Attached File(s) Thumbnail(s)
   
Find all posts by this user
Quote this message in a reply
02-25-2019, 07:57 AM
Post: #2
RE: Optimized Stack Operations
Couldn't you call a common subroutine to handle both sine and cosine? They are nearly the same apart from an x offset.

— Ian Abbott
Find all posts by this user
Quote this message in a reply
02-25-2019, 08:03 AM
Post: #3
RE: Optimized Stack Operations
OLED display running off a CR2032?

What kind of battery life are you expecting to get out of that?
Find all posts by this user
Quote this message in a reply
02-25-2019, 08:42 AM (This post was last modified: 02-25-2019 08:43 AM by deetee.)
Post: #4
RE: Optimized Stack Operations
Power management is essential - so I implemented automatic dimming, display off and finally deep sleep.

Running on a single battery (CR2032) my calculator draws a current of 10 mA (bright display with a lot of "8s"). With a battery capacity of at least 200 mAh it can calculate approximately 20 hours.
After dimming the current falls to 5.5 mA, after deactivating the display 1.1 mA are needed.
In sleep mode it consumes less than 0.25 mA. With a battery capacity of at least 200 mAh it lasts more than a month in sleep mode.

And finally I plan to use 2 (parallel) batteries.
Find all posts by this user
Quote this message in a reply
02-25-2019, 08:45 AM
Post: #5
RE: Optimized Stack Operations
(02-25-2019 07:57 AM)ijabbott Wrote:  They are nearly the same apart from an x offset.

It was also my idea to use \(\cos \theta = \sin \theta + \frac{\pi}{2}\):
Code:
PI
2
/
+
SIN

Maybe you already keep \(\frac{\pi}{2}\) as a constant.
Another benefit is that you don't have to deal with the sign of the square root if \(\cos \theta < 0\).

Cheers
Thomas
Find all posts by this user
Quote this message in a reply
02-25-2019, 09:04 AM (This post was last modified: 02-25-2019 09:05 AM by deetee.)
Post: #6
RE: Optimized Stack Operations
Wow - I did not expect to be so blind in this early stage.

Now I calculate COS with:
Code:
 CHS 90 + SIN
... which reduces this sequence to 5 bytes.

I'm looking forward when you improve my GAMMA-function. :-)

Thanks
Find all posts by this user
Quote this message in a reply
02-25-2019, 09:14 AM
Post: #7
RE: Optimized Stack Operations
Exactly how are you calculating sin()? If you're using CORDIC then it should produce both sin() and cos() simultaneously anyway, thus giving you tan() as well while you're at it.
Find all posts by this user
Quote this message in a reply
02-25-2019, 09:45 AM
Post: #8
RE: Optimized Stack Operations
Actually I'm using something odd - but it makes sense for saving memory:

I use one subroutine to calculate exp(), sin() and asin() - all have "similar" Taylor polynoms:
Code:

static double _exp_sin_asin(double f, byte nr) { // Calculate exp, sin or asin
  double result = f; // Start values for sin or asin
  double frac = f;
  if (nr == BITEXP) result = frac = 1.0; // Start values for exp
  for (int i = 1; _abs(frac) > TINYNUMBER && i < MAXITERATE; i++) {
    int i2 = 2 * i, i2p = 2 * i + 1, i2m = 2 * i - 1, i2m2 = i2m * i2m;
    double ffi2i2p = f * f / (i2 * i2p);
    if (nr == BITEXP) frac *= f / i; // Fraction for exp
    else if (nr == BITSIN) frac *=  -ffi2i2p; // Fraction for sin
    else frac *= ffi2i2p * i2m2; // Fraction for asin
    result += frac;
  }
  return (result);
}
Find all posts by this user
Quote this message in a reply
02-25-2019, 09:50 AM
Post: #9
RE: Optimized Stack Operations
But by using this and therefore the stdlib math library (I see a "double" type in there), you're far from saving memory!

Using CORDIC you just need a few constants and everything is calculated using addition, subtraction, bit shifting and table look-ups. Not a float or double in sight.
Find all posts by this user
Quote this message in a reply
02-25-2019, 09:57 AM
Post: #10
RE: Optimized Stack Operations
I didn't know CORDIC - but it seems very interesting.
Do you have a good link for programming CORDIC? - WIKI seems to be to "theoretical".

Thanks for the hint.
Find all posts by this user
Quote this message in a reply
02-25-2019, 10:00 AM
Post: #11
RE: Optimized Stack Operations
I can't remember exactly where I found it but I did find documentation on the 'net that enabled me to write my own math library with up to 24 digits of precision. It's there if you look in the right places Smile

It's not yet ready for prime time because, well, life kind of took over a while back, but when it is ready it'll be open source.
Find all posts by this user
Quote this message in a reply
02-25-2019, 10:09 AM
Post: #12
RE: Optimized Stack Operations
This document was a great help to me.
Find all posts by this user
Quote this message in a reply
02-25-2019, 12:37 PM
Post: #13
RE: Optimized Stack Operations
Do a search on Cordic in the Forum as there many, many references

Denny Tuckerman
Visit this user's website Find all posts by this user
Quote this message in a reply
02-25-2019, 03:19 PM
Post: #14
RE: Optimized Stack Operations
(02-25-2019 09:04 AM)deetee Wrote:  Now I calculate COS with:
Code:
 CHS 90 + SIN
... which reduces this sequence to 5 bytes.

You can also use:
Code:
90 + SIN
… and then there were four.

Quote:I'm looking forward when you improve my GAMMA-function. :-)

Viktor T. Toth has implemented the Gamma function for a lot of calculators.
So you might find some inspirations.

Cheers
Thomas
Find all posts by this user
Quote this message in a reply
02-25-2019, 03:26 PM
Post: #15
RE: Optimized Stack Operations
(02-25-2019 09:57 AM)deetee Wrote:  Do you have a good link for programming CORDIC? - WIKI seems to be to "theoretical".

This might be something for you: Exploring the CORDIC algorithm with the WP-34S

Cheers
Thomas
Find all posts by this user
Quote this message in a reply
02-25-2019, 08:04 PM
Post: #16
RE: Optimized Stack Operations
While your question is broad and there are many ways to contribute (as you see above) one reference tool that may be useful is Tony Hutchins' XYZT Tables document, which shows the quickest way (least number of basic RPN instructions) to go from one stack configuration to another. It's much harder to explain than it is to grasp once you see it.

Gene shared a link here. A must-have document when dealing with stackrobatics.

--Bob Prosperi
Find all posts by this user
Quote this message in a reply
02-25-2019, 09:03 PM
Post: #17
RE: Optimized Stack Operations
(02-25-2019 08:04 PM)rprosperi Wrote:  While your question is broad and there are many ways to contribute (as you see above) one reference tool that may be useful is Tony Hutchins' XYZT Tables document, which shows the quickest way (least number of basic RPN instructions) to go from one stack configuration to another. It's much harder to explain than it is to grasp once you see it.

Gene shared a link here. A must-have document when dealing with stackrobatics.

Not working anymore.
Use Pedro's instead

Greetings,
    Massimo

-+×÷ ↔ left is right and right is wrong
Visit this user's website Find all posts by this user
Quote this message in a reply
02-25-2019, 09:49 PM
Post: #18
RE: Optimized Stack Operations
(02-25-2019 09:03 PM)Massimo Gnerucci Wrote:  Not working anymore.
Use Pedro's instead

Thanks for fixing that Massimo.

I actually uploaded a copy of the PDF to OneDrive to share, and then thought 'hey, someone must have done this before now', then searched and found Gene's post and saw the thread had other discussion that may be of interest to the OP, so just posted that link. Absolutely should have verified the link first though... (smacking forehead).

--Bob Prosperi
Find all posts by this user
Quote this message in a reply
02-26-2019, 06:57 AM
Post: #19
RE: Optimized Stack Operations
(02-25-2019 03:19 PM)Thomas Klemm Wrote:  Viktor T. Toth has implemented the Gamma function for a lot of calculators.
So you might find some inspirations.

This helped me a lot - now I calculate GAMMA with a shorter and more "stack friendly" formula (with a precision of at least 3 digits):
Code:
      GAMMA (due to formula of Nemes)
        COMMANDS (35): 1 + ENTER ENTER ENTER 10 * INV CHS ROTup 12 * + INV +
                       1 EXP / ROTup PWR ROTup INV SQRT * 2.506628 *

        Converts    4 mem  -    to:    -
        stack[]     ---------         ---                                 x + 1/(12*x - 1/10/x)
        (incl.      3  t   -           -         (x-1)! = sqrt(2*PI/x)* (----------------------)^x
        mem)        2  z   -           -                                            e
        from:       1  y   -           -
                    0  x   x           x!

But I still think that some bytes are to save with optimized stack handling.

Thanks
Find all posts by this user
Quote this message in a reply
02-26-2019, 09:16 AM
Post: #20
RE: Optimized Stack Operations
(02-26-2019 06:57 AM)deetee Wrote:  But I still think that some bytes are to save with optimized stack handling.

Code:
1 + ENTER ENTER ENTER 12 * ROTup 10 * INV – INV +
1 EXP / ROTup PWR 2.506628 ROTup SQRT / *

If there's also a SWAP or X<>Y command some of the ROTup commands could be replaced.

Cheers
Thomas
Find all posts by this user
Quote this message in a reply
Post Reply 




User(s) browsing this thread: 4 Guest(s)