Threaded Mode | Linear Mode

robve · 03-11-2022, 01:48 AM

The classic "sure-shot" Laguerre root finding method. C and BASIC implementations are included with comments. This version of the Laguerre method is implemented with the same optimizations applied the Householder root finder posted previously. A quick test shows that Laguerre solves roots 15% faster than the Householder method on a Sharp PC-1475 for the five examples listed below, while requiring the same number of iterations. Results may differ for other machines.

LAGUERRE POLYNOMIAL ROOT FINDER
3th order convergence, worst-case linear convergence for roots with multiplicities

Optimized implementations in C and BASIC:
- Finds complex roots z of polynomial p(z)
- Customizable floating point accuracy MACHEPS
- Uses a novel root starting point to improve performance
- Prevents overflow/nan/inf
- Prevents loss of precision when possible, e.g. in reciprocals
- Tested with a large sample of polynomials

See also: https://en.wikipedia.org/wiki/Laguerre's_method
W.H. Press et al. Numerical Recipes FORTRAN 2ed
David Goldberg. What Every Computer Scientist Should Know About Floating Point, ACM Computing Surveys, 1991
J.M. McNamee. Numerical Methods for Roots of Polynomials, Part I & II, Elsevier, Kidlington, Oxford 2009

For SHARP PC-12xx to 14xx

To use with SHARP PC models with a single display line:
- remove CURSOR and CLS
- replace PRINT with PAUSE except at line 60

1,761 bytes BASIC image (PC-1350)

Degree: max polynomial degree limited by time and memory only

Example:

(Solve x^3-2x^2-x+2=0)
[DEF-A]
N=3 A3=1 A2=-2 A1=-1 A0=2
CALCULATING...
RE          IM
          2.          0.
         -1.          0.
          1.          0.

(Solve x^3+3x^2+x+3=0)
[DEF-A]
N=3 A3=1 A2=3 A1=1 A0=3
CALCULATING...
RE          IM
          0.          1.
          0.         -1.
         -3.          0.

(Solve x^4+2x^3-13x^2-14x+24=0)
[DEF-A]
N=4 A4=1 A3=2 A2=-13 A1=-14 A0=24
CALCULATING...
RE          IM
          3.          0.
         -4.          0.
         -2.          0.
          1.          0.

(Solve x^4+2999x^3-10003e3x^2-2399e7x+24e9=0)
[DEF-A]
N=4 A4=1 A3=2999 A2=-10003E3 A1=-2399E7 A0=24E9
CALCULATING...
RE          IM
       3000.          0.
      -4000.          0.
      -2000.          0.
          1.          0.

(Solve 5x^6-45x^5+225x^4-425x^3+170x^2+370x-500=0)
[DEF-A]
N=6 A6=5 A5=-45 A4=225 A3=-425 A2=170 A1=370 A0=-500
CALCULATING...
RE          IM
          3.          4.
          3.         -4.
          2.          0.
         -1.          0.
          1.         -1.
          1.          1.

Code:

' LAGUERRE POLYNOMIAL ROOT FINDER

' 3th order convergence, worst-case linear convergence for roots with multiplicities

'

' For SHARP PC-12xx to 14xx by Robert van Engelen

' To use with SHARP PC models with a single display line:

' - remove CURSOR and CLS

' - replace PRINT with PAUSE except at line 60

'

' Optimized implementations in C and BASIC:

' - Finds complex roots z of polynomial p(z)

' - Customizable floating point accuracy MACHEPS

' - Uses a novel root starting point to improve performance

' - Prevents overflow/nan/inf

' - Prevents loss of precision when possible, e.g. in reciprocals

' - Tested with a large sample of polynomials

' 

' See also: https://en.wikipedia.org/wiki/Laguerre's_method

' W.H. Press et al. Numerical Recipes FORTRAN 2ed

' David Goldberg. What Every Computer Scientist Should Know About Floating Point, ACM Computing Surveys, 1991

' J.M. McNamee. Numerical Methods for Roots of Polynomials, Part I & II, Elsevier, Kidlington, Oxford 2009

' Algorithm to find complex roots of a polynomial with complex coefficients:

'   // define MachEps machine epsilon precision, pow(FLT_RADIX, -DBL_MANT_DIG + 1) for double precision

'   #define MACHEPS 1E-16

'   // squared magnitude of complex z

'   #define cnorm(z) (creal(z) * creal(z) + cimag(z) * cimag(z))

'   void Laguerre(int n, const complex double a[], complex double z[]) {

'     int i, m;

'     double e, t;

'     complex double *b, p, p1, p2, r, h, g, u, v;

'     // find the trailing zero coefficients and remove them

'     for (m = n; m > 1 && a[m] == 0; --m)

'       z[m-1] = 0;

'     // leading coefficient cannot be zero

'     if (m < 1 || a[0] == 0)

'       return -1;

'     // copy coefficients to deflate the polynomial

'     b = (complex double*)malloc((m + 1) * sizeof(complex double));

'     memcpy(b, a, (m + 1) * sizeof(complex double));

'     // find the complex roots z[m-1] to z[2]

'     while (m > 2) {

'       // pick a suitable stating point for the root search

'       r = sqrt(0.5);

'       if (b[m-1] != 0)

'         r *= -b[m] / b[m-1];

'       // my novel idea (invention) to improve the complex root starting point for faster convergence

'       if (cimag(r) == 0)

'         r = CMPLX(creal(r), creal(r)/m);

'       // root modulus error tolerance

'       e = 6 * m * cabs(b[m]) * MACHEPS;

'       // loop to compute root z[m-1]

'       do {

'         // compute p(z), p'(z), p"(z) using Horner's rule

'         p = b[0];

'         p1 = 0;

'         p2 = 0;

'         for (i = 1; i <= m; ++i) {

'           p2 = p2 * r + p1;

'           p1 = p1 * r + p;

'           p = p * r + b[i];

'         }

'         // if |p(z)| is (close to) zero then stop searching; using triangle inequality to prevent overflow

'         if (fabs(creal(p)) < e && fabs(cimag(p)) < e && cnorm(p) < e*e)

'           break;

'         // compute G = p'(z)/p(z), H = G^2-p"(z)/p(z)

'         g = p1/p;

'         u = g*g;

'         h = u - p2/p;

'         // compute h = m/(G+/-sqrt((m-1)*(m*H-G^2))) with maximal denominator magnitude

'         v = csqrt((m-1) * (m*h - u));

'         u = g+v;

'         v = g-v;

'         if (cnorm(u) < cnorm(v))

'           u = v;

'         h = m/u;

'         r -= h;

'       }

'       while (cnorm(h) > (cnorm(r) + 1) * MACHEPS*MACHEPS);

'       if (fabs(cimag(r)) < MACHEPS) {

'         // |z.im| < MachEps, consider root is real

'         r = creal(r);

'       }

'       else {

'         // compute p(z.re) using Horner's rule

'         t = cnorm(p);

'         p = b[0];

'         for (i = 1; i <= m; ++i)

'           p = p * creal(r) + b[i];

'         // if |p(z.re)| <= |p(z)| then root is real

'         if (cnorm(p) <= t)

'           r = creal(r);

'       }

'       // save the complex root

'       z[m-1] = r;

'       // forward deflate the polynomial

'       v = b[0];

'       for (i = 1; i < m; ++i)

'         v = b[i] += v * r;

'       --m;

'     }

'     // compute the roots for the remaining m=1 and m=2 cases

'     if (m == 1) {

'       // linear case

'       z[0] = -b[1] / b[0];

'     }

'     else if (b[1] == 0) {

'       // quadratic case when b=0

'       z[0] = csqrt(-b[2] / b[0]);

'       z[1] = -z[0];

'     }

'     else {

'       // quadratic formula without potential catastrophic cancellation

'       v = b[1] * b[1] - 4 * b[0] * b[2];

'       if (creal(v) <= 0 || creal(b[1]) < 0)

'         z[0] = (-b[1] + csqrt(v)) / (2 * b[0]);

'       else

'         z[0] = (2 * b[2]) / (-b[1] - csqrt(v));

'       z[1] = b[2] / (b[0] * z[0]);

'     }

'     free(b);

'   }

'   

' Algorithm to find complex roots of a polynomial with real coefficients, using separate real and imaginary variables:

'   // define MachEps machine epsilon precision, pow(FLT_RADIX, -DBL_MANT_DIG + 1) for double precision

'   #define MACHEPS 1E-16

'   // squared magnitude of complex (zr,zi)

'   #define norm(zr, zi) ((zr) * (zr) + (zi) * (zi))

'   void laguerre(int n, const double a[], double zr[], double zi[]) {

'     int i, m;

'     double *b, e, k1, k2, pr, pi, pr1, pi1, pr2, pi2, rr, ri, hr, hi, gr, gi, ur, ui, vr, vi;

'     // find the trailing zero coefficients and remove them

'     for (m = n; m > 1 && a[m] == 0; --m)

'       zr[m-1] = zi[n-1] = 0;

'     // leading coefficient cannot be zero

'     if (m < 1 || a[0] == 0)

'       return -1;

'     // copy coefficients to deflate the polynomial

'     b = (double*)malloc((m + 1) * sizeof(double));

'     memcpy(b, a, (m + 1) * sizeof(double));

'     // loop to find the complex roots z[m-1] to z[2]

'     while (m > 2) {

'       // pick a suitable stating point for the root search

'       rr = sqrt(0.5);

'       if (b[m-1] != 0)

'         rr *= -b[m] / b[m-1];

'       // my novel idea (invention) to improve the complex root starting point for faster convergence

'       ri = rr/m;

'       // root modulus error tolerance

'       e = 6 * m * fabs(b[m]) * MACHEPS;

'       // loop to compute root z[m-1]

'       do {

'         // compute p(z), p'(z), p"(z) using Horner's rule

'         pr = b[0];

'         pi = 0;

'         pr1 = 0;

'         pi1 = 0;

'         pr2 = 0;

'         pi2 = 0;

'         for (i = 1; i <= m; ++i) {

'           // compute p2 = p2 * r + p1

'           k1 = pr2;

'           pr2 = pr2 * rr - pi2 * ri + pr1;

'           pi2 = pi2 * rr + k1  * ri + pi1;

'           // compute p1 = p1 * r + p

'           k1 = pr1;

'           pr1 = pr1 * rr - pi1 * ri + pr;

'           pi1 = pi1 * rr + k1  * ri + pi;

'           // compute p = p * r + b[i]

'           k1 = pr;

'           pr = pr * rr - pi * ri + b[i];

'           pi = pi * rr + k1 * ri;

'         }

'         // if |p(z)| is (close to) zero then stop searching; using triangle inequality to prevent overflow

'         if (fabs(pr) < e && fabs(pi) < e && norm(pr, pi) < e*e)

'           break;

'         // compute G = p'(z)/p(z) and p"(z)/p(z)

'         if (fabs(pr) >= fabs(pi)) {

'           k1 = pi/pr;

'           k2 = k1*pi + pr;

'           gr = (pr1 + k1*pi1) / k2;

'           gi = (pi1 - k1*pr1) / k2;

'           hr = (pr2 + k1*pi2) / k2;

'           hi = (pi2 - k1*pr2) / k2;

'         }

'         else {

'           k1 = pr/pi;

'           k2 = k1*pr + pi;

'           gr = (k1*pr1 + pi1) / k2;

'           gi = (k1*pi1 - pr1) / k2;

'           hr = (k1*pr2 + pi2) / k2;

'           hi = (k1*pi2 - pr2) / k2;

'         }

'         // compute G^2

'         ur = gr*gr - gi*gi;

'         ui = 2*gr*gi;

'         // compute H = G^2-p"(z)/p(z)

'         hr = ur-hr;

'         hi = ui-hi;

'         // compute v = sqrt((m-1)*(m*H-G^2)) without potential intermediate overflow

'         hr = (m-1) * (m*hr - ur);

'         hi = (m-1) * (m*hi - ui);

'         if (hi == 0) {

'           if (hr < 0) {

'             hi = sqrt(-hr);

'             hr = 0;

'           }

'           else {

'             hr = sqrt(hr);

'           }

'         }

'         else {

'           if (fabs(hr) < fabs(hi)) {

'             k1 = hr/hi;

'             k1 = fabs(hi) * sqrt(1 + k1*k1);

'           }

'           else {

'             k1 = hi/hr;

'             k1 = fabs(hr) * sqrt(1 + k1*k1);

'           }

'           vr = sqrt((k1+hr) / 2);

'           vi = copysign(sqrt((k1-hr) / 2), hi);

'         }

'         // compute h = m/(G+/-sqrt((m-1)*(m*H-G^2))) with maximal denominator magnitude

'         ur = gr+vr;

'         ui = gi+vi;

'         vr = gr-vr;

'         vi = gi-vi;

'         if (norm(ur, ui) < norm(vr, vi)) {

'           ur = vr;

'           ui = vi;

'         }

'         if (fabs(ur) >= fabs(ui)) {

'           k1 = ui/ur;

'           k2 = k1*ui + ur;

'           hr = m / k2;

'           hi = -k1*m / k2;

'         }

'         else {

'           k1 = ur/ui;

'           k2 = k1*ur + ui;

'           hr = k1*m / k2;

'           hi = -m / k2;

'         }

'         rr -= hr;

'         ri -= hi;

'       }

'       while (norm(hr, hi) > (norm(rr, ri) + 1) * MACHEPS*MACHEPS);

'       // if |z.im| < MachEps then root is real

'       if (fabs(ri) < MACHEPS) {

'         ri = 0;

'       }

'       else {

'         // compute p(z.re) using Horner's rule

'         k1 = b[0];

'         for (i = 1; i <= m; ++i)

'           k1 = k1 * rr + b[i];

'         // if |p(z.re)| <= |p(z)| then root is real

'         if (k1*k1 <= norm(pr, pi))

'           ri = 0;

'       }

'       // save the complex root

'       zr[m-1] = rr;

'       zi[m-1] = ri;

'       if (ri == 0) {

'         // root is real; deflate the polynomial

'         k1 = b[0];

'         for (i = 1; i < m; ++i)

'           k1 = b[i] += k1 * rr;

'       }

'       else {

'         // root is complex; add conjugate root and deflate by quadratic

'         --m;

'         zr[m-1] = rr;

'         zi[m-1] = -ri;

'         k1 = -2 * rr;

'         k2 = rr*rr + ri*ri;

'         b[1] -= k1 * b[0];

'         for (i = 2; i < m; ++i)

'           b[i] -= k1 * b[i-1] + k2 * b[i-2];

'       }

'       --m;

'     }

'     // compute the roots for the remaining m=1 and m=2 cases

'     if (m == 1) {

'       // linear case

'       zr[0] = -b[1] / b[0];

'       zi[0] = 0;

'     }

'     else if (b[1] == 0) {

'       // quadratic case when b=0

'       k1 = -b[2] / b[0];

'       if (k1 > 0) {

'         zr[0] = sqrt(k1);

'         zi[0] = 0;

'         zr[1] = -zr[0];

'         zi[1] = 0;

'       }

'       else {

'         zr[0] = 0;

'         zi[0] = sqrt(-k1);

'         zr[1] = 0;

'         zi[1] = -zi[0];

'       }

'     }

'     else {

'       // quadratic formula without potential catastrophic cancellation

'       k1 = b[1] * b[1] - 4 * b[0] * b[2];

'       if (k1 >= 0) {

'         if (b[1] > 0) {

'           k2 = -b[1] - sqrt(k1);

'           zr[0] = 2 * b[2] / k2;

'           zi[0] = 0;

'           zr[1] = 0.5 * k2 / b[0];

'           zi[1] = 0;

'         }

'         else {

'           k2 = -b[1] + sqrt(k1);

'           zr[0] = 0.5 * k2 / b[0];

'           zi[0] = 0;

'           zr[1] = 2 * b[2] / k2;

'           zi[1] = 0;

'         }

'       }

'       else {

'         zr[0] = -0.5 * b[1] / b[0];

'         zi[0] = 0.5 * sqrt(-k1) / fabs(b[0]);

'         zr[1] = zr[0];

'         zi[1] = -zi[0];

'       }

'     }

'     free(b);

'   }

' VARIABLES

'  N              degree

'  A(27..27+N)    N+1 coefficients indexed from highest to lowest

'  A(28+N..28+2N) N+1 copied coefficients used in deflation

'  A(O..O+N-1)    N roots real parts

'  A(Z..Z+N-1)    N roots imag parts

'  O              array offset in A() with the real parts of the roots

'  Z              array offset in A() with the imag parts of the roots

'  I              loop iterator

'  L              array offset in A() with copied coefficients used in deflation

'  M              internal adjusted degree M<=N

'  A,B            real and imag parts of root z

'  P,Q            real and imag parts of p(z)

'  C,D            real and imag parts of p'(z)

'  R,S            real and imag parts of p"(z)

'  E              eps = 6*m*|a[m]|*MachEps

'  F,G            real and imag parts of g

'  H,K            real and imag parts of h

'  U,X            real and imag parts of u

'  V,Y            real and imag parts of v

'  T,W            scratch

' driver program

10 "A" CLS: WAIT 0: INPUT "N=";N: IF N<2 GOTO 10

20 FOR I=0 TO N: CLS: PRINT "A";STR$(N-I);"=";A(27+I): CURSOR 24: INPUT A(27+I)

30 NEXT I

40 CLS: PRINT "CALCULATING...": GOSUB "LAG"

50 PRINT "RE","IM": WAIT

60 FOR I=0 TO N-1: PRINT A(O+I),A(Z+I): NEXT I

70 END

' polynomial root finder: N degree, A(27..27+N) coefficients

100 "LAG" L=28+N,O=L+N+1,Z=O+N,M=N

' optimized: remove leading zero coefficients (trailing A)

110 IF M>1 IF A(27+M)=0 LET M=M-1,A(O+M)=0,A(Z+M)=0: GOTO 110

120 IF M<1 OR A(27)=0 RETURN

' copy coefficients to deflate the polynomial

130 FOR I=0 TO M: A(L+I)=A(27+I): NEXT I

' loop to find the complex roots z[m-1] to z[2]

140 IF M<3 GOTO 440

150 FOR M=M TO 3 STEP -1: CURSOR 24: PRINT "ROOT ";M

160 A=SQR .5: IF A(L+M-1)<>0 LET A=-A*A(L+M)/A(L+M-1)

170 B=A/M,E=6E-10*M*ABS A(L+M)

' loop to compute root z[m-1]

' compute p(z), p'(z), p"(z) using Horner's rule

180 P=A(L),Q=0,C=0,D=0,R=0,S=0

190 FOR I=1 TO M

200 T=R,R=R*A-S*B+C,S=S*A+T*B+D,T=C,C=C*A-D*B+P,D=D*A+T*B+Q

210 T=P,P=P*A-Q*B+A(L+I),Q=Q*A+T*B

220 NEXT I

' if |p(z)| is (close to) zero then stop searching; using triangle inequality to prevent overflow

230 IF ABS P<E IF ABS Q<E IF P*P+Q*Q < E*E GOTO 360

' compute G = p'(z)/p(z) and p"(z)/p(z)

240 IF ABS P<ABS Q GOTO 255

250 T=Q/P,W=T*Q+P,F=(C+T*D)/W,G=(D-T*C)/W,H=(R+T*S)/W,K=(S-T*R)/W: GOTO 260

' note: assignments moved from line 240 (after the IF) to line 255 with GOTO 260 removed, to make it fit on some Sharp PC

255 T=P/Q,W=T*P+Q,F=(T*C+D)/W,G=(T*D-C)/W,H=(T*R+S)/W,K=(T*S-R)/W

' compute G^2 and H = G^2-p"(z)/p(z) and v = sqrt((m-1)*(m*H-G^2)) without potential intermediate overflow

260 U=F*F-G*G,X=2*F*G,H=U-H,K=X-K,H=(M-1)*(M*H-U),K=(M-1)*(M*K-X)

270 IF K=0 LET T=SQR ABS H,H=(H>0)*T,K=(H<0)*T: GOTO 310

280 IF ABS H<ABS K LET T=H/K,T=ABS K*SQR(1+T*T): GOTO 300

290 T=K/H,T=ABS H*SQR(1+T*T)

300 V=SQR(.5*(T+H)),Y=SGN K*SQR(.5*(T-H))

' compute h = m/(G+/-sqrt((m-1)*(m*H-G^2))) with maximal denominator magnitude

310 U=F+V,X=G+Y,V=F-V,Y=G-Y: IF U*U+X*X<V*V+Y*Y LET U=V,X=Y

320 IF ABS U<ABS X LET T=U/X,W=T*U+X,H=T*M/W,K=-M/W: GOTO 340

330 T=X/U,W=T*X+U,H=M/W,K=-T*M/W

340 A=A-H,B=B-K

' loop while |h|^2 > (|r|^2 + 1) * MachEps

350 IF H*H+K*K>(A*A+B*B+1)*1E-20 GOTO 180

' if |z.im| < MachEps then root is real

360 IF ABS B<1E-10 LET B=0: GOTO 390

' compute p(z.re) using Horner's rule

370 T=A(L): FOR I=1 TO M: T=T*A+A(L+I): NEXT I

' if |p(z.re)| <= |p(z)| then root is real

380 IF T*T<=P*P+Q*Q LET B=0

390 A(O+M-1)=A,A(Z+M-1)=B

' if root is real then deflate the polynomial

400 IF B=0 LET T=A(L): FOR I=1 TO M-1: T=A(L+I)+T*A,A(L+I)=T: NEXT I: GOTO 430

' root is complex; add conjugate root and deflate by quadratic

410 M=M-1:A(O+M-1)=A,A(Z+M-1)=-B,T=-2*A,W=A*A+B*B,A(L+1)=A(L+1)-T*A(L)

420 FOR I=2 TO M-1: A(L+I)=A(L+I)-T*A(L+I-1)-W*A(L+I-2): NEXT I

430 NEXT M

' compute the roots for the remaining m=1 and m=2 cases

440 IF M=1 LET A(O)=-A(L+1)/A(L),A(Z)=0: RETURN

450 IF A(L+1)<>0 GOTO 480

460 T=-A(L+2)/A(L): IF T>0 LET A(O)=SQR T,A(Z)=0,A(O+1)=-A(O),A(Z+1)=0: RETURN

470 A(O)=0,A(Z)=SQR -T,A(O+1)=0,A(Z+1)=-A(Z): RETURN

' quadratic formula without potential catastrophic cancellation

480 T=A(L+1),W=T*T-4*A(L)*A(L+2)

490 IF W<0 LET A(O)=-.5*T/A(L),A(Z)=.5*SQR(-W)/ABS A(L),A(O+1)=A(O),A(Z+1)=-A(Z): RETURN

500 IF T>0 LET W=-T-SQR W,A(O)=2*A(L+2)/W,A(Z)=0,A(O+1)=.5*W/A(L),A(Z+1)=0: RETURN

510 W=-T+SQR W,A(O)=.5*W/A(L),A(Z)=0,A(O+1)=2*A(L+2)/W,A(Z+1)=0: RETURN

robve · 03-11-2022, 02:11 AM

Shown below is a comparison of Householder 3rd order method (4th order convergence) to the Laguerre method (3rd order convergence) for solving polynomials with complex coefficients and real coefficients. For each of the four given MachEps values, the total number of iterations to converge the roots is shown with the sum or residual errors. Note that the last root or last two roots are computed directly, hence the polynomials shown in red font require 0 iterations. Also note that the residual errors are expected to typically larger for polynomials with large coefficients, since a small change in the root z may cause a large change in p(z). Finally, note that the 3rd order polynomials are easy to solve. These are included to determine the performance and stability of the method to find a single root for a range of cases.

[Image: laguerre.png]

See also the Householder forum post, which includes details on the parameters used in the implementations.

Quattro · 08-21-2024, 05:51 PM

After the first derivative, factorial constants should come in:

p'(z) = p1
p"(z) = 2 * p2

When these constants are taken into account, the number of iterations to converge will be significantly reduced.

In the implementations which use separate real and imaginary variables, vr and vi (or V and Y in line 310) are not updated when hi (or K in line 270) equals zero and so the previous values are used.

A better way to compute the real part of G^2 is ur = (gr-gi) * (gr+gi) (or U = (F-G) * (F+G) in line 260).

The maximal denominator magnitude can be simply computed as:

if gr*vr < -gi*vi {ur = gr-vr; ui = gi-vi;} else {ur = gr+vr; ui = gi+vi;}
(or in line 310 IF F*V<-G*Y LET U=F-V,X=G-Y: GOTO 320 newline 315 U=F+V,X=G+Y).