Threaded Mode | Linear Mode

Dieter · (This post was last modified: 08-07-2016 08:48 PM by Dieter.)

I think most of us will know the usual Newton method for finding the roots of a function. This requires the evaluation of the function f(x) as well as its derivative f'(x), which usually means two function calls per iteration step.

For a project where the derivative cannot be calculated fast and easily I now tried the following method that is based on a method already discussed earlier, e.g. here and there. The idea here is the simulaneous calculation of f(x) and f'(x) via complex math. On the 35s this can be done quite easily – at least within the supported function set.

Here is a sample program to illustrate the idea.

Code:

S001 LBL S

S002 INPUT X

S003 1E-6

S004 STO H

S005 i

S006 RCLx H

S007 RCL+ X

S008 XEQ F001

S009 ARG

S010 COS

S011 LASTX

S012 SIN

S013 /

S014 RCLx H

S015 STO- X

S016 FIX 6

S017 RND

S018 X≠0?

S019 GTO S005

S020 RCL X

S021 VIEW X

S022 GTO S001

F001 LBL F

F002 ENTER

F003 ENTER

F004 ENTER

F005 1

F006 -

F007 x

F008 1

F009 -

F010 x

F011 0,5

F012 +

F013 RTN

Please note: this is just to show the idea, not a fully working program. ;-)

Steps S009...S014 calculate the quotient of the real (approx. f(x)) and imaginary parf (approx. f'(x)). The sine and cosine should not be replaced by 1/tangent since f(x)=0 will yield a real part of ±90° where the tangent is not defined (a cotangent function would be nice here...).

Example:

Code your function f(x) at LBL F, assuming the argument x in the X-register. The sample function is x³–x²–x+0,5 = 0. After XEQ S enter an initial guess for x.

Code:

[XEQ] S [ENTER]     X?

      0 [R/S]       X= 0,403032

[XEQ] S [ENTER]     X?

      2 [R/S]       X= 1,451606

[XEQ] S [ENTER]     X?

     -2 [R/S]       X=-0,854638

So this seems to work.

What about limitations of this method? As far as I get it both f(x) as well as f'(x) are only approximate. I wonder how accurate especially f(x) is evaluated using this method. Maybe the math experts here can say more about this. ;-)

Dieter

Thomas Klemm · (This post was last modified: 09-24-2018 01:59 AM by Thomas Klemm.)

(08-07-2016 08:47 PM)Dieter Wrote: What about limitations of this method? As far as I get it both f(x) as well as f'(x) are only approximate. I wonder how accurate especially f(x) is evaluated using this method. Maybe the math experts here can say more about this. ;-)

Using the Taylor series we can separate the real and imaginary part of \(f(x+ih)\):

\(\begin{align*}
\Re[f(x+ih)]&=f(x)-\frac{f''(x)}{2!}h^2+O(h^4)\\
\Im[f(x+ih)]&=h(f'(x)-\frac{f'''(x)}{3!}h^2+O(h^4))
\end{align*}\)

Thus by choosing a small \(h\) we can make the difference as small as we want.
You can verify this by using \(h=10^{-50}\) and the \(\sin\) function:

sin(2 + 1e-50 i) = 9.09297426826e-1 - 4.16146836547e-51 i

sin(2) = 9.09297426826e-1
cos(2) = -4.16146836547e-1

Both values are exact. But for a calculator with 12 significant digits something smaller like \(10^{-10}\) might be enough. It really depends on the value of the 2nd and 3rd derivative at the root.

We could go further and try to use \(h=10^{-99}\). But then we may end up with 0.
Thus we have to choose \(h\) small enough to avoid errors but not too small.

Kind regards
Thomas

Ángel Martin · (This post was last modified: 09-25-2018 01:54 PM by Ángel Martin.)

Nicely done, it piqued my curiosity so I went ahead and jotted down the following 41Z version of the same concept.- About 30 steps in this proof of concept; which expects the function name in ALPHA, h in Y and the xo guess (a real value) in X.

Code:

01  LBL "ZNWT"   

02  ASTO 00    ; F.Name

03  X<>Y       

04  STO 01     ; h

05  CLX        ; ensure Im(z)=0

06  X<>Y   

07  LBL 00     ; loop

08  FS? 10     ; show value?

09  ZAVIEW     ; yes

10  ZSTO 01    ; save Zo = xo + 0i

11  XEQ IND 00 ; calculate f(Zo)

12  ZSTO 02    ; save f(Zo)

13  ZRCL 01    ; Zo

14  RCL 01     ; h

15  X<>Y    

16  XEQ IND 00 ; calculate f(Zo+i.h)

17  X<>Y       ; Im[f(Zo+i.h)]

18  RCL 01     ; h

19  /          ; f'(xo)

20  ST/ 04     ; divides Re[f(Zo)] by it

21  ST/ 05     ; divides Im[f(Zo)] (which should be zero anyway)

22  ZRCL 01    ; Zo

23  ZENTER^   

24  ZRCL 02    ; f(xo)/f'(xo)

25  Z-         ; x1

26  Z#W?       ; converged?

27  GTO 00     ; no, next iteration

28  CLA        ; yes, reset ALPHA

29  ARCL 00

30  END        ; done

The execution shows the successive values of the root if user flag 10 is set (a la PPC-ROM).
Convergence criteria always uses 9 decimal places - irrespective of the display FIX settings.

The function is to be programmed using 41Z functions (definitely a super-set of those in the 35S), as a complex equation.

To use Dieter's same example:

Code:

01  LBL "Z1"

02  Z^3

03  LASTZ

04  Z^2

05  LASTZ

06  Z+

07  Z-

08  ,5

09  +

10  END

Results:

ALPHA, "Z1", ALPHA
0.1, ENTER^, 2, XEQ "ZNWT" -> 1.451605963
0.1, ENTER^, 0, XEQ "ZNWT" -> 0.403031717
0.1, ENTER^, -2, XEQ "ZNWT"-> -0.854637680

Supposedly more accurate results would be obtained with smaller values of h...

Cheers,
ÁM

Thomas Klemm · 09-26-2018, 07:56 AM

You don't have to execute the function twice since both, value and derivative are returned:

Code:

01 LBL "ZNWT"   ; h         x

02 ASTO 02      ;

03 ZTRP         ; x         h

04 ZSTO 00      ; R00: h    R01: x

05 LBL 00       ; loop

06 VIEW 01      ; view x

07 ZRCL 00      ; x         h

08 XEQ IND 02   ; f(x + ih)

09 X<>Y         ; ∆f(x)h    f(x)

10 /            ; ∆x/h

11 RCL 00       ; h         ∆x/h

12 *            ; ∆x

13 ST- 01       ; x' = x - ∆x

14 RND          ; ~∆x

15 X#0?         ; ∆x ~ 0 ?

16 GTO 00       ; loop

17 RCL 01       ; x

18 END          ; return x

Example:

2
ENTER
1E-9
XEQ "ZNWT"

2.000000000
1.642857143
1.487473705
1.453261463
1.451609751
1.451605963

(09-25-2018 11:46 AM)Ángel Martin Wrote: The execution shows the successive values of the root if user flag 10 is set (a la PPC-ROM).
Convergence criteria always uses 9 decimal places - irrespective of the display FIX settings.

Feel free to change that to your needs: Add the check of user flag 10 and remove RND.

Quote:Supposedly more accurate results would be obtained with smaller values of h...

Usually 1e-9 will be small enough. But you might also use 1e-50.

Kind regards
Thomas

Ángel Martin · (This post was last modified: 09-26-2018 11:21 AM by Ángel Martin.)

(09-26-2018 07:56 AM)Thomas Klemm Wrote: You don't have to execute the function twice since both, value and derivative are returned:

Brilliant Thomas, many thanks for snapping me out of my stupor to see what was before my nose!

And so glad to see there's at least one more 41Z-literate person in the world ;-)
BTW nice move choosing R02 for the function name instead of R00 - this allows using the 41Z functions on ZR00, which uses "0" as default parameter so it's not required in the program.

Really impressive - although you must be using an older revision of the module, because with the latest one (the 'Deluxe" edition) the real part is stored in the lower register of the pair, i.e. ZSTO 00 stores Re(z) in R00 and Im(z) in R01

So re-writing the code for the 41Z-Deluxe conventions (transposing R00 and R01), and starting with h in Y, x guess in X:

Code:

01 LBL "ZNWT"   ; x         h

02 ASTO 02      ;

03 ZSTO 00      ; R01: h    R00: x

04 LBL 00       ; loop

05 VIEW 00      ; view x

06 ZRCL (00)    ; x         h

07 XEQ IND 02   ; f(x + ih)

08 X<>Y         ; ∆f(x)h    f(x)

09 /            ; ∆x/h

10 RCL 01       ; h         ∆x/h

11 *            ; ∆x

12 RND         ;~∆x

13 ST- 00       ; x' = x - ∆x

14 X#0?         ; ∆x ~ 0 ?

15 GTO 00       ; loop

16 RCL 00       ; x

17 END          ; return x

17 steps, nothing can beat that ;-)

Thanks much again.
Best,
ÁM

PS. As it turns out removing RND is not a good idea because some times the oscillation in the 10th. decimal place causes the test X#0? never to occur. This is a well-known behavior of the Newton method, so nothing new here...

Thomas Klemm · 09-26-2018, 12:49 PM

(09-26-2018 09:05 AM)Ángel Martin Wrote: And so glad to see there's at least one more 41Z-literate person in the world ;-)

You made me read your excellent documentation.
And who in this forum doesn't love to read manuals?

Quote:BTW nice move choosing R02 for the function name instead of R00 - this allows using the 41Z functions on ZR00, which uses "0" as default parameter so it's not required in the program.

I've tried that but somehow it didn't work. But I must admit I didn't invest too much into it.

Quote:Really impressive - although you must be using an older revision of the module, because with the latest one (the 'Deluxe" edition) the real part is stored in the lower register of the pair, i.e. ZSTO 00 stores Re(z) in R00 and Im(z) in R01

I was using a somewhat buggy/unstable emulator my41CX on my iPhone since I didn't have access to an emulator on a computer.
When the calculator is turned off it displays for a moment: 41Z-REV: 9Z

Cheers
Thomas

Dieter · 09-27-2018, 12:06 PM

(09-26-2018 09:05 AM)Ángel Martin Wrote:

Code:

10 RCL 01 ; h ∆x/h 11 * ; ∆x 12 RND ;~∆x 13 ST- 00 ; x' = x - ∆x 14 X#0? ; ∆x ~ 0 ? 15 GTO 00 ; loop

I don't think it's a good idea to round the calculated ∆x to display precision and then adjust x by this lower-precision value. So you should first adjust x (ST– 00) and then RND the delta and do the X≠0 test. Simply swap line 12 and 13.

Dieter

Albert Chan · (This post was last modified: 09-27-2018 12:40 PM by Albert Chan.)

(08-07-2016 08:47 PM)Dieter Wrote: I think most of us will know the usual Newton method for finding the roots of a function. This requires the evaluation of the function f(x) as well as its derivative f'(x), which usually means two function calls per iteration step.

For a project where the derivative cannot be calculated fast and easily ...
The idea here is the simulaneous calculation of f(x) and f'(x) via complex math.

Does the project reach its goal ?
I know complex math can get better derivative, but is it really fast ?

I would guess complex math for f(x) and f(x + h*I) are much slower than real f(x) and f(x+h).
Just look at effort needed to multiply of 2 complex numbers ...

Ángel Martin · (This post was last modified: 09-27-2018 12:44 PM by Ángel Martin.)

(09-27-2018 12:06 PM)Dieter Wrote: I don't think it's a good idea to round the calculated ∆x to display precision and then adjust x by this lower-precision value. So you should first adjust x (ST– 00) and then RND the delta and do the X≠0 test. Simply swap line 12 and 13.

Very true, thanks.

(09-26-2018 12:49 PM)Thomas Klemm Wrote: I was using a somewhat buggy/unstable emulator my41CX on my iPhone since I didn't have access to an emulator on a computer.
When the calculator is turned off it displays for a moment: 41Z-REV: 9Z

Yes, that's a very old revision, probably not even using the Library#4...

Ángel Martin · (This post was last modified: 09-27-2018 12:45 PM by Ángel Martin.)

(09-27-2018 12:38 PM)Albert Chan Wrote: Does the project reach its goal ?
I know complex math can get better derivative, but is it really fast ?

I would guess complex math for f(x) and f(x + h*I) are much slower than real f(x) and f(x+h).
Just look at effort needed to multiply of 2 complex numbers ...

I guess it all depends on the platform. On the 41Z it definitely meets the goal - with a single execution of the function to calculate both the function *and* its derivative.

I'd think that's also the case for the 42S and even the 35S since all complex functions are also MCODE.

Dieter · (This post was last modified: 09-27-2018 01:08 PM by Dieter.)

(09-27-2018 12:38 PM)Albert Chan Wrote: Does the project reach its goal ?
I know complex math can get better derivative, but is it really fast ?

I would guess complex math for f(x) and f(x + h*I) are much slower than real f(x) and f(x+h).
Just look at effort needed to multiply of 2 complex numbers ...

A complex multiply is just four real multiplications, and two additions/subtractions. Thats "nothing".

While I can't say much about this method in high level languages with complex number support by the compiler, I bet that on almost any calculator this method will be faster than the classic (approximative) approach with f(x) and f(x+h):

First, there is just one single function call per iteration step. The complex result holds both f(x) and f'(x). The classic method requires two calls per iteration. This alone speeds up the calculation, even if the complex functions should be somewhat slower.

More important, I don't know any programmable pocket calculator where complex functions are returned significantly slower to the user than their real counterparts. Sure, the internal calculations are more elaborate, but all this is "low level" code. The execution time for such functions is completely negligible compared to the running speed of the calculator program, which does merely a dozen or maybe a few hundred user instructions per second. The CPU that handles the complex math "under the hood" may do thousands of low level instructions per second to calculate a function that is returned almost immediately.

So all in all the complex evaluation is not significantly slower than real math, and at the same time only half of the function calls is required. Yes, that's faster. ;-)

Dieter

Valentin Albillo · 09-27-2018, 10:04 PM

.
Hi, Dieter:

(09-27-2018 01:07 PM)Dieter Wrote: A complex multiply is just four real multiplications, and two additions/subtractions.

It can be done with just three real multiplications (instead of 4) and five additions/subtractions (instead of two), i.e. trading one multiplication for three add/subs.

In user code (say RPN) this would probably fail to be advantageous because the time to execute a "*" is about the same as the time to execute a "+" or "-", which in both cases is dominated by overhead not related to the actual math operation.

However, if doing the multiplications, additions and subtractions as part of a microcoded keyword or function (such as the ones Ángel implements all the time) then the individual overheads are much less and it might be the case that calling (or implementing) a microcode (or assembly language) multiplication might be more expensive than doing three extra subs/adds, and thus the 3m+5as way would be faster than the 4m+2as one.

Ángel probably has tested this possibility and if so he knows what's better (or if not he might try it) and will perhaps share his experience with us.

Regards.
V.
.