Threaded Mode | Linear Mode

robve · (This post was last modified: 03-10-2022 02:38 PM by robve.)

The Aberth method version of the Durand-Kerner program. I've added my tested and verified C version for comparison and commented the code. Perhaps someone finds this useful.

ABERTH POLYNOMIAL ROOT FINDER
Aberth method: Durand-Kerner with Newton's method, converges generally faster (not always)
Estimate the complex roots of a polynomial
Note: Durand-Kerner is not guaranteed to converge
3rd order convergence, worst-case linear convergence for roots with multiplicities

Optimized implementations in C and BASIC:
- Finds complex roots z of polynomial p such that |p(z)|<T given tolerance T
- Prevents overflow/nan/inf by bounding the root shift adjustment magnitude during polishing
- Prevents loss of precision when possible, e.g. in reciprocals
- Tested with a large sample of polynomials

O. Aberth, Iteration Methods for finding all zeros of a Polynomial Simultaneously, Mathematics Computation, Vol 27, Number 122, April 1973
https://en.wikipedia.org/wiki/Durand–Kerner_method
https://en.wikipedia.org/wiki/Aberth_method
W.H. Press et al. Numerical Recipes FORTRAN 2ed
David Goldberg. What Every Computer Scientist Should Know About Floating Point, ACM Computing Surveys, 1991

For SHARP PC-12xx to 14xx

To use with SHARP PC models with a single display line:
- remove CURSOR and CLS
- replace PRINT with PAUSE except at line 80

1098 bytes BASIC image (PC-1350)

Degree: max polynomial degree limited by time and memory only
Tolerance: T=1E-6 (adjustable)
Levels: up to L=20 iterations (adjustable)

Example:

(Solve x^2+2x+3=0)
[DEF-A]
N=2 A2=1 A1=2 A0=3
T= L=
CALCULATING...
RE          IM
         -1.  -1.414214.
         -1.   1.414214.

(Solve x^3-2x^2-x+2=0)
[DEF-A]
N=3 A3=1 A2=-2 A1=-1 A0=2
T= L=
CALCULATING...
RE          IM
          1.          0.
          2.          0.
         -1.          0.

(Solve x^3+3x^2+x+3=0)
[DEF-A]
N=3 A3=1 A2=3 A1=1 A0=3
T= L=
CALCULATING...
RE          IM
          0.         -1.
          0.          1.
         -3.          0.

(Solve x^4+2999x^3-10003e3x^2-2399e7x+24e9=0)
[DEF-A]
N=4 A4=1 A3=2999 A2=-10003E3 A1=-2399E7 A0=24E9
T= L=
CALCULATING...
RE          IM
          1.          0.
       3000.          0.
      -4000.          0.
      -2000.          0.

(Solve 5x^6-45x^5+225x^4-425x^3+170x^2+370x-500=0)
[DEF-A]
N=6 A6=5 A5=-45 A4=225 A3=-425 A2=170 A1=370 A0=-500
T= L=
CALCULATING...
RE          IM
          2.          0.
          1.          1.
          3.          4.
         -1.          0.
          3.         -4.
          1.         -1.

Code:

' ABERTH POLYNOMIAL ROOT FINDER

' Aberth method: Durand-Kerner with Newton's method, converges generally faster (not always)

' Estimate the complex roots of a polynomial

' 3rd order convergence, worst-case linear convergence for roots with multiplicities

'

' For SHARP PC-12xx to 14xx by Robert van Engelen

' To use with SHARP PC models with a single display line:

' - remove CURSOR and CLS

' - replace PRINT with PAUSE except at line 80

'

' Optimized implementations in C and BASIC:

' - Finds complex roots z of polynomial p such that |p(z)|<T given tolerance T

' - Prevents overflow/nan/inf by bounding the root shift adjustment magnitude during polishing

' - Prevents loss of precision when possible, e.g. in reciprocals

' - Tested with a large sample of polynomials

'

' https://en.wikipedia.org/wiki/Durand–Kerner_method and https://en.wikipedia.org/wiki/Aberth_method

' W.H. Press et al. Numerical Recipes FORTRAN 2ed

' David Goldberg. What Every Computer Scientist Should Know About Floating Point, ACM Computing Surveys, 1991

' Algorithm to find complex double precision roots of a polynomial with complex coefficients:

'   int ADK(int n, const complex double a[], complex double z[], int maxit, double toler) {

'     int i, j, k, m;

'     complex double r, p, q, d, f;

'     double err, max, h, nprec, dprec;

'     if (maxit <= 0)

'       maxit = 100;

'     if (toler <= 0 || toler >= 1)

'       toler = 1e-8;

'     // optimize by removing trailing zero coefficients

'     for (m = n; m > 1 && a[m] == 0; --m)

'       z[m-1] = 0;

'     // leading coefficient cannot be zero

'     if (m < 1 || a[0] == 0)

'       return -1; // error

'     // generate initial roots

'     for (i = 1; i <= m; ++i)

'       z[i-1] = cpow(CMPLX(.4, .9), i-1);

'     // max root stride modulus permitted to prevent overflow

'     max = pow(DBL_MAX, 1./m);

'     if (max < 1)

'       max = 1;

'     // Aberth-Ehrlich root polishing

'     for (k = 1; k <= maxit; ++k) {

'       err = 0;

'       for (i = 1; i <= m; ++i) {

'         // compute p(z) and derivative p'(z) using Horner's rule

'         p = a[0];

'         q = 0;

'         r = z[i-1];

'         for (j = 1; j <= m; ++j) {

'           q = q * r + p;

'           p = p * r + a[j];

'         }

'         // if p(z) = 0 within acceptable tolerance, then do not update

'         if (cabs(p) <= toler)

'           continue;

'         // compute denominator d=q-sum(i!=j,p/(z[i]-z[j])) of (p/q)/(1-p/q*sum(i!=j,1/(z[i]-z[j]))) => p/(q-sum(i!=j,p/(z[i]-z[j])))

'         d = 0;

'         for (j = 1; j <= m; ++j) {

'           if (i != j) {

'             f = r - z[j-1];

'             if (cabs(f) >= toler)

'               d += p/f;

'           }

'         }

'         d = q - d;

'         // compute p/d if d is not too small to prevent overflow

'         if (cabs(d) > toler)

'           p /= d;

'         h = cabs(p);

'         err += h;

'         // polish root if its shift adjustment is within floating point bounds to prevent overflow

'         if (h < max)

'           z[i-1] = r - p;

'       }

'       // check if convergence error is below our tolerance threshold

'       if (err <= toler)

'         break;

'     }

'     // round results (optional)

'     nprec = toler;

'     dprec = 1/toler;

'     for (i = 1; i <= m; ++i) {

'       r = z[i-1];

'       z[i-1] = nprec*CMPLX(round(creal(r)*dprec), round(cimag(r)*dprec));

'     }

'     return k; // number of iterations >0

'   }

'

' Algorithm to find complex roots of a polynomial with real coefficients, using separate real and imaginary variables:

'   int ADK(int n, const double a[], double zr[], double zi[], int maxit, double toler) {

'     int i, j, k, m;

'     double rr, pr, dr;

'     double ri, pi, di;

'     double s1, s2, k1, k2, k3;

'     double err2, max2, tol2; // err, max, toler squared

'     double h, t, nprec, dprec;

'     const double zh = sqrt(.97), zt = atan(2.25);

'     if (maxit <= 0)

'       maxit = 100;

'     if (toler <= 0 || toler >= 1)

'       toler = 1e-8;

'     tol2 = toler * toler;

'     // optimize by removing trailing zero coefficients

'     for (m = n; m > 1 && a[m] == 0; --m)

'       zr[m-1] = zi[m-1] = 0;

'     // leading coefficient cannot be zero

'     if (m < 1 || a[0] == 0)

'       return -1; // error

'     // generate initial roots z[i]=(.4+.9I)^i for i=0 to m-1

'     h = zh;

'     t = zt;

'     zr[0] = 1;

'     zi[0] = 0;

'     for (i = 2; i <= m; ++i) {

'       zr[i-1] = h * cos(t);

'       zi[i-1] = h * sin(t);

'       h *= zh;

'       t += zt;

'     }

'     // square of max root stride modulus permitted to prevent overflow

'     max2 = m > 2 ? pow(DBL_MAX, 2./m) : DBL_MAX;

'     if (max2 < 1)

'       max2 = 1;

'     // Aberth-Ehrlich root polishing

'     for (k = 1; k <= maxit; ++k) {

'       err2 = 0;

'       for (i = 1; i <= m; ++i) {

'         // compute p(z) and derivative p'(z) using Horner's rule

'         pr = a[0];

'         pi = 0;

'         qr = 0;

'         qi = 0;

'         rr = zr[i-1];

'         ri = zi[i-1];

'         for (j = 1; j <= m; ++j) {

'           // compute q = q*r+p

'           k1 = qr;

'           qr = qr * rr - qi * ri + pr;

'           qi = qi * rr + k1 * ri + pi;

'           // compute p = p*r+a[j]

'           k1 = pr;

'           pr = pr * rr - pi * ri + a[j];

'           pi = pi * rr + k1 * ri;

'         }

'         // if p(z) = 0 within acceptable tolerance, then do not update

'         if (pr * pr + pi * pi <= tol2)

'           continue;

'         // compute denominator d=q-sum(i!=j,p/(z[i]-z[j])) of (p/q)/(1-p/q*sum(i!=j,1/(z[i]-z[j]))) => p/(q-sum(i!=j,p/(z[i]-z[j])))

'         dr = 0;

'         di = 0;

'         for (j = 1; j <= m; ++j) {

'           if (i != j) {

'             // compute d = d + 1/(r - z[j])

'             s1 = rr - zr[j-1];

'             s2 = ri - zi[j-1];

'             if (s1 * s1 + s2 * s2 >= tol2) {

'               if (fabs(s1) < fabs(s2)) {

'                 k1 = s1/s2;

'                 k2 = k1 * s1 + s2;

'                 s1 = (pr * k1 + pi)/k2;

'                 s2 = (pi * k1 - pr)/k2;

'               }

'               else {

'                 k1 = s2/s1;

'                 k2 = s1 + k1 * s2;

'                 s1 = (pr + pi * k1)/k2;

'                 s2 = (pi - pr * k1)/k2;

'               }

'               dr += s1;

'               di += s2;

'             }

'           }

'         }

'         dr = qr - dr;

'         di = qi - di;

'         // compute p/d if d is not too small to prevent overflow

'         if (dr * dr + di * di > tol2) {

'           // compute p/d without loss of precision

'           if (fabs(dr) >= fabs(di)) {

'             k1 = di/dr;

'             k2 = k1*di + dr;

'             k3 = pr;

'             pr = (k1*pi + pr)/k2;

'             pi = (pi - k1*k3)/k2;

'           }

'           else {

'             k1 = dr/di;

'             k2 = k1*dr + di;

'             k3 = pr;

'             pr = (k1*pr + pi)/k2;

'             pi = (k1*pi - k3)/k2;

'           }

'         }

'         // polish root if its shift adjustment is within floating point bounds to prevent overflow

'         h = pr * pr + pi * pi;

'         err2 += h;

'         if (h < max2) {

'           zr[i-1] = rr - pr;

'           zi[i-1] = ri - pi;

'         }

'       }

'       // check if convergence error is below our tolerance threshold

'       if (err2 <= tol2)

'         break;

'     }

'     // round results (optional)

'     nprec = toler;

'     dprec = 1/toler;

'     for (i = 1; i <= m; ++i) {

'       zr[i-1] = nprec*round(zr[i-1]*dprec);

'       zi[i-1] = nprec*round(zi[i-1]*dprec);

'     }

'     return k; // number of iterations >0

'   }

' VARIABLES

'  N           degree

'  T           tolerance (1E-6)

'  L           max iterations (20), updated to the iterations executed

'  A(27..27+N) N+1 coefficients indexed from highest to lowest

'  A(O..O+N-1) N roots real parts

'  A(Z..Z+N-1) N roots imag parts

'  E           estimated error of the roots squared

'  O           array offset in A() with the real parts of the roots, O=28+N

'  Z           array offset in A() with the imag parts of the roots, Z=O+N

'  S           tolerance T squared

'  R           max root stride modulus squared

'  I,J,K       loop iterators

'  P,Q         real and imag parts of p(z)

'  F,G         real and imag parts of p'(z)

'  A,B,C,D     scratch

'  U,V,W,X,Y   scratch

' driver program

10 "A" CLS: WAIT 0: INPUT "N=";N: IF N<2 GOTO 10

20 FOR I=0 TO N: CLS: PRINT "A";STR$(N-I);"=";A(27+I): CURSOR 24: INPUT A(27+I)

30 NEXT I

40 CURSOR: T=1E-6,L=20: INPUT "T=";T

50 INPUT "L=";L

60 CLS: PRINT "CALCULATING...": O=28+N: GOSUB "ADK"

70 CURSOR: PRINT "RE","IM": WAIT

80 FOR I=0 TO N-1: PRINT A(O+I),A(Z+I): NEXT I

90 END

' polynomial root finder: N degree, T tolerance, L max iters, A(27..27+N) coefficients

100 "ADK" S=T*T,O=28+N,Z=O+N,M=N

' optimize: remove leading zero coefficients (trailing A)

110 IF M>1 IF A(27+M)=0 LET M=M-1,A(O+M)=0,A(Z+M)=0: GOTO 110

120 IF M<1 OR A(27)=0 RETURN

' generate initial roots, square of max root shift distance bounded to prevent overflow

130 RADIAN: X=SQR .97,Y=ATN 2.25,U=X,V=Y,A(O)=1,A(Z)=0,R=1E99: IF M>2 LET R=R^(2/M): IF R<1 LET R=1

140 IF M>1 FOR I=1 TO M-1: A(O+I)=U*COS V,A(Z+I)=U*SIN V,U=U*X,V=V+Y: NEXT I

' iterative root polishing until convergence or max iterations reached

150 FOR K=1 TO L: E=0

160 FOR I=0 TO M-1

' compute p(z) and p'(z) using Horner's rule

170 A=A(O+I),B=A(Z+I),C=0,D=0,F=0,G=0,P=A(27),Q=0

180 FOR J=0 TO M-1: U=F,F=F*A-G*B+P,G=G*A+U*B+Q,U=P,P=P*A-Q*B+A(28+J),Q=Q*A+U*B: NEXT J

' optimized: if |p(z)|<tolerance then do not update root z

190 IF P*P+Q*Q<=S GOTO 310

' compute denominator

200 FOR J=0 TO M-1: IF I=J GOTO 250

210 X=A-A(O+J),Y=B-A(Z+J)

220 IF X*X+Y*Y>=S LET U=X/Y,V=U*X+Y,X=(P*U+Q)/V,Y=(Q*U-P)/V: GOTO 240

230 U=Y/X,V=X+U*Y,X=(P+Q*U)/V,Y=(Q-P*U)/V

240 C=C+X,D=D+Y

250 NEXT J

260 C=F-C,D=G-D

' compute p/d if d is not too small to prevent overflow

270 X=C*C+D*D: IF X<=S GOTO 300

280 IF ABS C>=ABS D LET U=D/C,V=U*D+C,W=P,P=(U*Q+P)/V,Q=(Q-U*W)/V: GOTO 300

290 U=C/D,V=U*C+D,W=P,P=(U*P+Q)/V,Q=(U*Q-W)/V

' polish root if its shift adjustment is within max bounds to prevent overflow

300 X=P*P+Q*Q,E=E+X: IF X<R LET A(O+I)=A-P,A(Z+I)=B-Q

310 NEXT I

320 CURSOR 24: PRINT "K=";STR$ K;":E=";SQR E: IF E<=S LET U=L,L=K,K=U

330 NEXT K

' round results (optional)

340 U=1/T

350 FOR I=0 TO M-1: A(O+I)=T*INT(U*A(O+I)+.5),A(Z+I)=T*INT(U*A(Z+I)+.5): NEXT I

' return error estimate

360 E=(E=0)*T+SQR E: RETURN

robve · (This post was last modified: 02-22-2022 02:17 AM by robve.)

Shown below is a basic performance comparison of Aberth to Weierstrass/Durand-Kerner using 24 polynomials, four of which have roots with multiplicities. The number of iterations to converge to the given tolerance is shown. The standard (std) Durand-Kerner may produce NaN for polynomials with roots with multiplicities or does not converge. The improved/optimized C version (WDK) of Weierstrass/Durand-Kerner almost always converges regardless. Note that the improved/optimized Aberth version of Durand-Kerner (ADK) generally converges slightly faster. Tighter tolerances may cause convergence to be slow (linear) or even fail when multiplicities are present, a known weakness of Durand-Kerner. Quadruple precision may succeed to meet tighter tolerances (not shown).

[Image: proot.png]