04-17-2021, 01:51 PM (This post was last modified: 04-17-2021 09:16 PM by robve.)
Post: #61
 robve Member Posts: 140 Joined: Sep 2020
(04-17-2021 12:03 AM)Albert Chan Wrote:  The plot showed a nice bell-shaped curve, easy to get area, simply by summing squares.

Rather than doing that, I think we can do better by quickly estimating an optimal d using a geometric sequence of points. I ran a few tests with this change to 'quad' for Exp-Sinh:

Code:
  else if (isfinite(a)) {
    int i;
    mode = 1; // Exp-Sinh
    for (i = 1; i <= 12; ++i) {
      v = 1 << i;
      printf("%d diff=%g\n", 1 << i, f(a + d/v)/v - f(a + v*d)*v);
    }
    c = a;
    v = a + d;
  }

Note what happens to 'diff' for these integrands:

exp(-.2*x) on [0,inf) d~45 is good
exp(-.1*x) on [0,inf) d~100 is good
exp(-.01*x) on [0,inf) d~700 is good
exp(-.001*x) on [0,inf) d~9000 is good
1/(x*x) on [1,inf) d~1 is good
1/(1+x*x) on [0,inf) d~1 is good
log(1+(x*x))/(x*x) on [0,inf) d~1 is good
(x*x*x)/(exp(x)-1) on [0,inf) d~7 is best d~8 is good
x/sinh(x) on [0,inf) d~9 is best d~8 is good
(x*x)/sqrt(exp(x)-1) on [0,inf) d~17 is best d~16 is good
pow(x,7)*exp(-x) on [0,inf) d~25 is best and d~32 is good
exp(-sqrt(x)) on [0,inf) d~81 is best and d~64 is good
exp(-x)*(x*x*x) on [0,inf) d~28 is good
exp(-x)*pow(x,5) on [0,inf) d~50 is good

Note the $$2^i$$ value when the diff sign changes, if it changes.

Periodic functions switch signs multiple times; these are not great to integrate with Exp-Sinh anyway, e.g. sin(x)/(x*x) on [0,inf).

The idea of using geometric sequences to determine d aligns with the Exp-Sinh rule. This method is not perfect yet and certainly needs some tweaking.

- Rob

Edit: I checked 208 Exp-Sinh integrals from the benchmarks and they all show that this conceptually works. I've added some more interesting cases to the list.

HP 71B,Prime G2;Ti VOY200,Nspire CXII CAS;Casio fx-CG50,fx-115ES+2;Sharp PC-G850VS,E500S,1475,1450,1360,1350,2500,1262,1500A
04-17-2021, 11:45 PM
Post: #62
 Albert Chan Senior Member Posts: 1,405 Joined: Jul 2018
(04-17-2021 01:51 PM)robve Wrote:  printf("%d diff=%g\n", 1 << i, f(a + d/v)/v - f(a + v*d)*v);
...
Note the $$2^i$$ value when the diff sign changes, if it changes.

There may be a flaw with this ...

Example, we know exp(-0.01*x) on [0,inf), d ~ 700 is good
But, we can rewrite the integrand as exp(-0.01/x)/x^2, that has optimal d ~ 1/700

With sign checking method, both will suggest optimal d ~ 700.

lua> f = function(x) return exp(-0.01*x) end
lua> g = function(x) return exp(-0.01/x)/(x*x) end
lua> Q.quad(Q.count(f), 0, huge, 700), Q.n
100            71
lua> Q.quad(Q.count(g), 0, huge, 1/700), Q.n
100            71

If we build g(x) = f(1/x)/x^2, both would use the *same* points, with orders swapped.
(points might have rounding errors with 1/x, but otherwise equivalent.)

The formula is just simple substitution, x = 1/y, dx = -dy/y^2

$$\displaystyle \int_a^∞ f(x)\,dx = \int_0^∞ f(x+a)\,dx = \int_0^∞ {f(1/y+a) \over y^2} dy$$
04-18-2021, 01:30 AM
Post: #63
 robve Member Posts: 140 Joined: Sep 2020
(04-17-2021 11:45 PM)Albert Chan Wrote:  There may be a flaw with this ...

With sign checking method, both will suggest optimal d ~ 700.

Not so fast. In your example the left side's magnitude f(a+d/v)/v is larger than f(a+d*v)*v, which means the sign changes for a different reason. In my test this shows that d should be small (because the test also takes the magnitude of the difference into account). So your example is not a counterexample to my earlier post on why this might work.

Furthermore, as you will notice, the sign change is not always the exact place for an optimal d, but close. The magnitude change is a critical indicator (and more reliable) in the neighborhood of the sign change. So far it works every time.

I was surprised to discover that the sign change shows up for all 208 integrals, which suggests there is a solid predictor based on this that can be perfected with some tweaking.

- Rob

04-18-2021, 01:42 AM (This post was last modified: 04-18-2021 02:14 AM by robve.)
Post: #64
 robve Member Posts: 140 Joined: Sep 2020
(04-18-2021 01:30 AM)robve Wrote:  In your example the left side's magnitude f(a+d/v)/v is larger than f(a+d*v)*v which means that the sign changes for a different reason. In my test this shows that d should be small (because the test also takes the magnitudes of the difference into account). So your example is not a counter example of my earlier post on why this might work.

Here are the numbers.

For your example exp(-0.01/x)/x^2 the change is not due to the right side's magnitude, but the left side:
d diff (left-right)
2 diff=2.92578 (3.92079 - 0.995012)
4 diff=14.3751 (15.3726 - 0.997503)
8 diff=58.0807 (59.0794 - 0.998751)
16 diff=217.149 (218.149 - 0.999375)
32 diff=742.577 (743.577 - 0.999688)
64 diff=2158.79 (2159.79 - 0.999844)
128 diff=4554.36 (4555.36 - 0.999922)
256 diff=5065.24 (5066.24 - 0.999961)
512 diff=1565.58 (1566.58 - 0.99998)
1024 diff=36.4476 (37.4476 - 0.99999)
2048 diff=-0.994646 (0.00534945 - 0.999995)

4096 diff=-0.999998 (2.72909e-11 - 0.999998)
8192 diff=-0.999999 (1.77573e-28 - 0.999999)
16384 diff=-0.999999 (1.87946e-63 - 0.999999)
32768 diff=-1 (5.26361e-134 - 1)
65536 diff=-1 (1.03212e-275 - 1)

For exp(-.01*x) the change is due to the right side's magnitude:
d diff (left-right)
2 diff=-2.92578 (0.995012 - 3.92079)
4 diff=-14.3751 (0.997503 - 15.3726)
8 diff=-58.0807 (0.998751 - 59.0794)
16 diff=-217.149 (0.999375 - 218.149)
32 diff=-742.577 (0.999688 - 743.577)
64 diff=-2158.79 (0.999844 - 2159.79)
128 diff=-4554.36 (0.999922 - 4555.36)
256 diff=-5065.24 (0.999961 - 5066.24)
512 diff=-1565.58 (0.99998 - 1566.58)
1024 diff=-36.4476 (0.99999 - 37.4476)
2048 diff=0.994646 (0.999995 - 0.00534945)

4096 diff=0.999998 (0.999998 - 2.72909e-11)
8192 diff=0.999999 (0.999999 - 1.77573e-28)
16384 diff=0.999999 (0.999999 - 1.87946e-63)
32768 diff=1 (1 - 5.26361e-134)
65536 diff=1 (1 - 1.03212e-275)

Part of tweaking the test to get an optimal d is to observe the cliff: there is a sharp drop in magnitude in the right part, from 1566.58 to 37.4476 while the left side is still smooth, followed by a sign change. This suggests an optimal d is somewhere near 512-1024.

Edit: I just now realized I had changed the test earlier today with respect to the magnitude check included in the test as a filter. I had not published this change or said anything about it. To produce the numbers shown above:

Code:
  else if (isfinite(a)) {
    int i, dh;
    double ph = 0;
    dh = d = 1;
    mode = 1; // Exp-Sinh
    for (i = 1; i <= 16; ++i) {
      double r = 1 << i;
      double fl = f(a + d/r)/r, fr = f(a + d*r)*r;
      double h = r*(fl - fr);
      if (dh == d && sign(ph) != sign(h) && fabs(fl) > fabs(fr))
        dh = 1 << (i-1);
      ph = h;
      printf("%g diff=%g (%g - %g)\n", r, h, r*fl, r*fr);
    }
    if (dh != d)
      printf("optimal d=%d\n", dh);
  }

Note that this scales the left fl and right fr by r just to make the values' magnitudes comparable for display; otherwise this is unnecessary. This is not the final version, since it can be further improved.

- Rob

04-18-2021, 12:56 PM (This post was last modified: 04-19-2021 07:01 PM by robve.)
Post: #65
 robve Member Posts: 140 Joined: Sep 2020
(04-17-2021 11:45 PM)Albert Chan Wrote:  With sign checking method, both will suggest optimal d ~ 700.

Edit: I've updated the document with an improved version of the predictor (shown below) and an explanation how it works.

The following update sets d=1/700 for this case, so d is now either increased or decreased towards an optimum:

Code:
  else if (isfinite(a)) {
    int i;
    double ph = 0, h;
    mode = 1; // Exp-Sinh
    for (i = 1; i <= 16; ++i) {
      double r = 1 << i;
      double fl = f(a + d/r), fr = f(a + d*r)*r*r;
      double h = fl - fr;
      if (!isfinite(h))
        break;
      if (i > 1 && sign(ph) != sign(h)) {
        if (h != 0)
          r /= 2; // r = (r/2*h - r*ph)/(h - ph); // we can also try regula falsi
        if (fabs(fl) < fabs(fr))
          d /= r;
        else
          d *= r;
        break;
      }
      if (fabs(h) < 3e-5) // sqrt(eps) or cbrt(MachEps)
        break;
      ph = h;
      printf("%g diff=%g (%g - %g)\n", r, h, fl, fr); // diagnostics
    }
    if (d != 1)
      printf("optimal d=%g\n", d);
  }

This is a rough predictor that could be further improved. But this predictor is already reasonably effective to increase or decrease d in order to reduce the number of points evaluated by Exp-Sinh.

The predictor evaluates points too, but quits when an optimal d is found. There are ways to reduce this overhead, e.g. taking larger steps for d and perhaps followed by a second refinement step to adjust d.

With the simple predictor I've tested 208 integrals on [a,+inf) in the zip file with eps=1e-9: 29 are improved with 61 fewer points on average to integrate each of those.

- Rob

04-19-2021, 10:16 PM
Post: #66
 Albert Chan Senior Member Posts: 1,405 Joined: Jul 2018
(04-18-2021 12:56 PM)robve Wrote:  There is one bad case exp(-3*(x*x)-4/(x*x)) with 228 more points. The predictor pushes too hard on this one where the diff is close to zero already at the beginning of the search, meaning there is no need to search for d in this case.
Note: this quote was removed, but it is instructive to see why it failed ...

Close to zero at the beginning of search is expected, for any f(x).
When v=1, f(v)*v - f(1/v)/v = f(1) - f(1) = 0

This is just the trivial solution, and cannot be used to predict d = 1.

To simplify, let v = e^t, instead of v = 2^t:

$$\displaystyle \int_0^∞ f(x)\,dx = \int_{-∞ }^∞ f(v)·v\,dt = \int_0^∞ \left[f(v)·v + f({1\over v})·{1\over v}\right]\,dt$$

Except at v=1, if the last two terms are the same size, we assume both are tiny.
By then the sum should cover most of the (likely bell-shaped) area, so we can estimate the optimal d.

---

Back to why f(x) = exp(-3*(x*x)-4/(x*x)) failed sign change search.

f(v)*v = f(1/v)/v, where v = e^t

Take log both side:

-3*exp(2t) - 4*exp(-2t) + t = -3*exp(-2t) - 4*exp(2t) - t
2t + exp(2t) - exp(-2t) = 0
t + sinh(2t) = 0

On the left, both terms are increasing functions, so the only solution is t=0.
Thus, except for the trivial v=1, there is no solution.

And, there are many more bad cases, say, g(x) = exp(-m*x), m ≥ 1

g(v)*v = g(1/v)/v

Again, substitute v=e^t, and take log both side:

-m*e^t + t = -m*e^-t - t
2t = m*(e^t - e^-t)
t = m * sinh(t)

sinh(t)   = t + t^3/3! + t^5/5! + ...
sinh(t)/t = 1 + t^2/3! + t^4/5! + ... ≥ 1

Excluding the trivial solution (t = sinh(t) ⇒ t = 0), t = m*sinh(t) requires m = t/sinh(t) < 1, so for m ≥ 1 there is no sign change.

---

Perhaps an estimate of d based on the peak of f(v)*v is safer?
(there is always a peak, unless the integral diverges)

f(x) = exp(-3*(x*x)-4/(x*x))
Searching for peak of f(v)*v (we can also do numerically):

diff(f(v)*v, v) = 0
f(v) + v*f'(v) = 0
f(v) + v*f(v)*(-6v + 8/v^3) = 0
1 - 6v^2 + 8/v^2 = 0
6v^4 - v^2 - 8 = 0              // assumed v ≠ 0
v^2 = (1+√193)/12               // we throw away the negative root, v^2 > 0
v = √((1+√193)/12) ≈ 1.114      // again, throw away negative root, v = e^t > 0

lua> f = function(x) x=x*x; return exp(-3*x-4/x) end
0.0005013071292635826            1.4667498799376226e-010
lua> Q.n
41

Same treatment for g(x) = exp(-m*x):

diff(g(v)*v, v) = 0
g(v)*1 + (g(v)*-m)*v = 0
1 - m*v = 0
v = 1/m            -- when g(v)*v peaks

Peak for d might not be "optimal", but may be good enough.
04-20-2021, 02:19 AM
Post: #67
 robve Member Posts: 140 Joined: Sep 2020
(04-19-2021 10:16 PM)Albert Chan Wrote:
(04-18-2021 12:56 PM)robve Wrote:  There is one bad case exp(-3*(x*x)-4/(x*x)) with 228 more points. The predictor pushes too hard on this one where the diff is close to zero already at the begin of the search, meaning there is no need to search for d in this case.
Note: this quote was removed, but it is instructive to see why it failed ...

Sharp eye to notice! Rather than posting every small change in a new post, which is harder and more time-consuming to go through, this small change in this post avoids this problematic case for the reason I had already mentioned earlier in that post: the case when the initial diffs are too close to zero.

This small update also has another advantage: it allows the search for an optimized d to start from any initially-specified d instead of d=1. Perhaps this means the method has a fixed point at the optimal d, reached by invoking the predictor repeatedly with the improved d and smaller steps. Interesting to investigate.

(04-19-2021 10:16 PM)Albert Chan Wrote:  And, there are many more bad cases, say, g(x) = exp(-m*x), m ≥ 1

Nope. The diff sign does not change for this integrand. And for larger m the search quits early.

(04-19-2021 10:16 PM)Albert Chan Wrote:  Perhaps estimate of d based on peak of f(v)*v is safer ?

I have not yet found a single counterexample that the predictor fails on. I also spent many hours trying various improvements and testing them with >208 integrals. I have about a dozen tweaks of the method with mixed results, not interesting enough to share.

A small change was needed to avoid the "bad" case and others similar to it, which are not difficult to distinguish from the valid cases by inspecting the diff h sequence. Note that v (now more appropriately renamed r) should not be too small initially; it is r=2 for the $$2^i$$ sequence. I also tried regula falsi to improve d, but unfortunately this does not always help. I also found that larger estimates of d (in the 1000s) tend to overshoot the optimal d. In this case, starting with a larger d and then repeating the predictor recursively should bring the estimate closer to optimal. However, keeping things simple pays off: moving d at least in the right direction helps convergence even when we don't hit the optimal d, so we don't evaluate more points than necessary (or more than Exp-Sinh needs!). In any case, the geometric sequence of r can be $$2^i$$ or $$e^i$$ or another, taking larger steps to reduce the number of evaluations; $$2^i$$ is just cheap to compute.

(04-19-2021 10:16 PM)Albert Chan Wrote:  f(x) = exp(-3*(x*x)-4/(x*x))
Searching for peak of f(v)*v (we can also do numerically)

I already tried searching for the "cliff" (or the point after the peak); I had mentioned the cliff before as relevant. But it is easy to see that the search should continue until the sign changes. Only at the point when the sign changes can the following comparison be safely made to determine the direction that d should move in:

if (fabs(fl) < fabs(fr))
d /= r;
else
d *= r;

Doing this at the peak, or right after, fails the method, causing d to move in the wrong direction. However, the peak or cliff can be closer to where the optimal d is located for large d (in the 100s or 1000s), before the sign changes, but this adds more complexity. A simple condition is to check that the diff h is not zero and in that case move r back one step:

if (h != 0)
r /= 2;

If the diff is zero, we are at a reasonably good or optimal r to adjust d. This can happen very early, for example with 1/(1+(x*x)) and log((1+(x*x))/x)/(1+(x*x)), or late, for example with x/sinh(x). For the last integrand we get d=4 (131 points, est.rel.err=2e-11), which improves the accuracy with the same number of points. Applying a regula falsi correction to r (instead of r /= 2, see the code) we get d=6.7 (69 points, est.rel.err=6e-10). That's incredible!

- Rob

04-20-2021, 01:58 PM
Post: #68
 Albert Chan Senior Member Posts: 1,405 Joined: Jul 2018
(04-20-2021 02:19 AM)robve Wrote:  I have not found a single counter example yet that does not work with the predictor.
...
If the diff is zero, we are at a reasonably good or optimal r to adjust d. This can happen very early
for example 1/(1+(x*x)) and log((1+(x*x))/x)/(1+(x*x)) ...

Are these two already counter-examples?
For both functions the two sides overlap: f(v)*v = f(1/v)/v

XCas> f1(x) := 1/(1+x*x)
XCas> f2(x) := log((1+(x*x))/x)/(1+(x*x))
XCas> simplify([f1(v)*v-f1(1/v)/v, f2(v)*v-f2(1/v)/v])            → [0, 0]

In other words, f(e^t)*(e^t) is an even function with respect to t.

Technically, there were no sign changes.
But with rounding errors, if lucky, we expect sign changes early on.
In other words, close to whatever the starting guess of d was.

It just happened that we started with d=1, and d=1 is close to optimal.

Now, try g1(x) = f1(x*10), g2(x) = f2(x*10)

Since this is simple scaling, we expect both to have optimal d ≈ 1/10

I don't think you'd catch a sign change here (except for v=1, useless to predict d)
04-20-2021, 07:25 PM
Post: #69
 robve Member Posts: 140 Joined: Sep 2020
(04-20-2021 01:58 PM)Albert Chan Wrote:
(04-20-2021 02:19 AM)robve Wrote:  I have not found a single counter example yet that does not work with the predictor.
...
If the diff is zero, we are at a reasonably good or optimal r to adjust d. This can happen very early
for example 1/(1+(x*x)) and log((1+(x*x))/x)/(1+(x*x)) ...
Is these two already counter-examples ?
...
But, with rounding errors, if lucky, we expected sign changes early on.

I mentioned before that for cases like these the diff h is already zero, or very close to zero. If a sign change happens in that case it is noise/roundoff and should be ignored (as is done in the updated code). These "bad" integrands are easy to filter out without deteriorating the predictions for "good" integrands. Note that the diff h behavior has no peak/cliff with these. A relevant sign change is preceded by some (significant) curve with a peak, that cannot be produced by rounding error noise.

Maybe I will be forced to eat my words later if there are many "bad" cases. But so far the hundreds of examples tested don't appear to suggest there is a fundamental problem with this approach, just minor details.

Honestly, it is not too difficult to create some problematic "bad" cases based on periodic/oscillating functions over [0,∞) for which the diff h has a useless sign change more than once. Take $$\sin(40\pi x)/x^2$$ for example. But doing that is a bit of "cheating", because these quadrature methods are known to perform poorly with such functions anyway (so it does not matter). IMHO it is more interesting to have a preprocessor to adjust d that works well for the Exp-Sinh integrands that we expect to work well with this quadrature. This should exclude "bad" integrands for which a good value of d cannot be determined to begin with, i.e. where Exp-Sinh convergence is always bad with a large residual error. Also, guessing a suboptimal d is not fatal for the "good" integrands.

There are several things that are intriguing to look into:
- For large estimated d values (in the 1000s) the predicted d tends to overshoot the optimum non-linearly. I tried a correction method with interpolation, but the results were mixed. My hunch is that the peak+cliff location should be taken into account somehow to adjust for large d (likewise, adjust for tiny d=1/r).
- Another way to address this sub-optimality problem for large d is to use a recursive scheme to improve d. We can try a sequence of four values, say r=2,10,100,10000 to detect very roughly where a sign change occurs, then adjust d and try again with some smaller points (say powers of 2) and again with a few points of smaller powers, etc.
- The predictor should quit early when d cannot be adjusted (no sign change) by testing the sequence begin/end points first e.g. r=2 and r=10000, to limit this overhead. Checking 1<r<2 is not useful. It only moves d upwards by a fraction which does not help much to improve convergence (downwards by a fraction is done with larger r, i.e. d = d/r).
- A solid theoretical analysis of this method is needed to better understand how robust it is.
- Can this test be used to quickly determine if an integrand is bad for Exp-Sinh to begin with?

- Rob

04-24-2021, 02:31 AM
Post: #70
 robve Member Posts: 140 Joined: Sep 2020
Some progress. The following updated and improved method predicts an optimal splitting point d for Exp-Sinh quadratures $$\int_a^\infty f(x)\,dx = \int_a^d f(x)\,dx + \int_d^\infty f(x)\,dx$$. The method uses bisection with an exponential scale. The method inspects 2 or 4 points to check if an optimization is applicable. Bisection uses just 8 points to find a close-to-optimal d. This reduces the overhead to only 5.95 evaluation points on average for the 208 integrals tested, requiring 179.38 points each on average with eps=1e-9.

Code:
// return optimized Exp-Sinh integral split point d
double exp_sinh_opt_d(double (*f)(double), double a, double eps, double d) {
  double h2 = f(a + d/2) - f(a + d*2)*4;
  int i = 1, j = 32;      // j=32 is optimal to find r
  if (isfinite(h2) && fabs(h2) > 1e-5) { // if |h2| > 2^-16
    double r, fl, fr, h, s = 0, lfl, lfr, lr = 2;
    do {                  // find max j such that fl and fr are finite
      j /= 2;
      r = 1 << (i + j);
      fl = f(a + d/r);
      fr = f(a + d*r)*r*r;
      h = fl - fr;
    } while (j > 1 && !isfinite(h));
    if (j > 1 && isfinite(h) && sign(h) != sign(h2)) {
      lfl = fl;           // last fl=f(a+d/r)
      lfr = fr;           // last fr=f(a+d*r)*r*r
      do {                // bisect in 4 iterations
        j /= 2;
        r = 1 << (i + j);
        fl = f(a + d/r);
        fr = f(a + d*r)*r*r;
        h = fl - fr;
        if (isfinite(h)) {
          s += fabs(h);   // sum |h| to remove noisy cases
          if (sign(h) == sign(h2)) {
            i += j;       // search right half
          }
          else {          // search left half
            lfl = fl;     // record last fl=f(a+d/r)
            lfr = fr;     // record last fr=f(a+d*r)*r*r
            lr = r;       // record last r
          }
        }
      } while (j > 1);
      if (s > eps) {      // if sum of |h| > eps
        h = lfl - lfr;    // use last fl and fr before the sign change
        r = lr;           // use last r before the sign change
        if (h != 0)       // if last diff != 0, back up r by one step
          r /= 2;
        if (fabs(lfl) < fabs(lfr))
          d /= r;         // move d closer to the finite endpoint
        else
          d *= r;         // move d closer to the infinite endpoint
      }
    }
  }
  return d;
}

// the quad() routine is updated in two branches to use this method:
  else if (isfinite(a)) {
    mode = 1; // Exp-Sinh
    d = exp_sinh_opt_d(f, a, eps, d);
    c = a;
    v = a + d;
  }
  else if (isfinite(b)) {
    mode = 1; // Exp-Sinh
    d = exp_sinh_opt_d(f, b, eps, -d);
    c = b;
    v = b + d;
    sign = -sign;
  }

This new method of optimization is generic and not specifically tailored to the 208 integrands. Integrands that are suitable for Exp-Sinh will benefit, but those that do not converge or very slowly converge such as periodic and oscillating functions will not benefit from an optimized splitting point d. Of the 208 integrals tested, 29 are improved significantly in accuracy and/or number of points evaluated (63 fewer points on average). Of these 208 integrands, 14 are slightly worse (11 more points on average). However, practically there is almost no impact because 8 of these 14 integrands are not suitable for Exp-Sinh quadrature and return a large error with and without the optimized d. The full Exp-Sinh quadrature results with eps=1e-9 are listed in Appendix C.

- Rob

04-24-2021, 08:24 PM
Post: #71
 Albert Chan Senior Member Posts: 1,405 Joined: Jul 2018
(04-24-2021 02:31 AM)robve Wrote:  Some progress. The following updated and improved method predicts an optimal splitting point d for Exp-Sinh quadratures $$\int_a^\infty f(x)\,dx = \int_a^d f(x)\,dx + \int_d^\infty f(x)\,dx$$. The method uses bisection with an exponential scale. The method inspects 2 or 4 points to check if an optimization is applicable. Bisection uses just 8 points to find a close-to-optimal d.

Max. of 12 points ... impressive
For illustration purposes, I tried one from Appendix C, where the suggested d does the most good.
(there was one that does better, but its integrand is oscillatory, so it might be just luck)

lua> f = function(x) local y=sqrt(x*x-1); return exp(-y*11/1000)*x/y end
lua> Q.exp_sinh_opt_d(f, 1) -- default eps=1e-9, d=1.
512

Code:
N   x         f(1+x)                   f(1+x)*x                sign(top-bot)
1   1/2       1.3252418334225524       0.6626209167112762
2   2         1.028168248457613        2.056336496915226            -
3   1/131072  255.99046499382536       0.0019530522536760357
4   131072    0                        0                            +
5   1/512     16.012410275011778       0.03127423881838238
6   512       0.003542270480289993     1.8136424859084763           -
7   1/8192    63.9948588286239         0.007180751663894064
8   8192      7.245672531983614e-040   5.9356549382009764e-036      +
9   1/2048    32.000714078579314       0.015625348671181306
10  2048      1.6271885884174952e-010  3.33248222907903e-007        +
11  1/1024    22.632978376948273       0.022102517946238548
12  1024      1.2686220440383643e-005  0.01299068973095285          +

I also made a version based on the peak, extrapolating for the x-intercept.

I've noticed that after the peak, the curve mostly goes straight down.
(the code uses the tangent line at the last point of the fitted quadratic)

Of course, with a peak, there are really 2 valleys.
The code assumes the optimal d is to the right of the peak.

Code:
function Q.peak(f, a, d)
    a = a or 0
    d = d or 1
    local L, R, k, P = f(a+d), f(a+d*2)*2, 2, huge-huge
    if R/L < 1.2 then return Q.peak(f, a, d/100) end
    while L/R < 1.2 do      -- stop if a bit passes peak
        P, L, k = L, R, k+k
        R = f(a+d*k)*k
    end
    d = d*k                 -- (0,R),(1,L),(2,P), 0 as reference
    local d0, d1 = L-R, P-L -- finite difference, fit quadratic p(x)
    P = d0/(d1-d0) - 1/2    -- interpolate for peak, p'(-x)=0, for x
    R = R/(1.5*d0-0.5*d1)   -- extrapolate x-intercept, -x = R/p'(0)
    return d*2^P, d*2^R     -- peak, x-intercept
end

lua> Q.peak(f, 1)
89.13375894266683            399.8238469921945

Note that the code evaluates an extra point past the peak (when points are too close).
This is to get a more reasonable slope for the tangent line.
Without the extra point, we would get peak, x-intercept ≈ 88.25, 9106.

Code:
N   x     f(1+x)                f(1+x)*x
1   1     1.1329087918426253    1.1329087918426253
2   2     1.028168248457613     2.056336496915226
3   4     0.9670764022530607    3.8683056090122427
4   8     0.9119448784679965    7.295559027743972
5   16    0.8311515864637837    13.29842538342054
6   32    0.6960220436099958    22.272705395519864
7   64    0.4892914169125088    31.314650682400565
8   128   0.2419734386421159    30.972600146190835
9   256   0.05919187288015089   15.153119457318628

Let's see the effect of the suggested d vs the default d=1

lua> quad(f, 1, huge) -- default d=1
90.90909088146049      3.999609164190213e-009      273
lua> quad(f, 1, huge, 512) -- sign change based d
90.90909089915341      6.360683358493127e-011      143
lua> quad(f, 1, huge, 400) -- x-intercept based d
90.90909089845148      2.0323842611238905e-011     143
lua> quad(f, 1, huge, 89) -- peak based d
90.90909089656193      4.928270670014641e-011      141

The peak may be used for d.
In that case, the Exp-Sinh transformed plot, when folded, turns into half a bell-shaped curve.
04-26-2021, 03:08 PM (This post was last modified: 04-27-2021 07:31 PM by Albert Chan.)
Post: #72
 Albert Chan Senior Member Posts: 1,405 Joined: Jul 2018
An improvement to previous post Q.peak().

Instead of calculating f(a+x)*x, this version uses f(a+x), saving some computation.
Bonus: this may avoid overflow/underflow issues when x is huge/tiny.

Code:
function Q.peak(f, a, d)    -- find peak of f(a+x)*x, guess=d
    a = a or 0
    d = d or 1
    local k, L, C, R = 2, f(a+d), f(a+d*2)
    while L/C > 2.353 do d=d/2; R=C; C=L; L=f(a+d) end
    if R then d=d*4 else d=d*2; R=C; C=L end
    while R/C > 0.425 do    -- stop if f(a+x)*x passes peak
        d = d+d
        L, C, R = C, R, f(a+d)
    end
    L, C = L/4, C/2         -- 3 points: (0,R), (1,C), (2,L)
    local d0, d1 = C-R, L-C -- finite difference, fit quadratic p(x)
    C = d0/(d1-d0) - 0.5    -- interpolate for peak, p'(x)=0, for -x
    R = R/(1.5*d0-0.5*d1)   -- extrapolate x-intercept, -x = R/p'(0)
    return d*2^C, d*2^R     -- peak, x-intercept
end

The code assumes the peak is bell-shaped, not bumpy.

Also, the code is optimized for a peak close to the user-supplied d.
Example, first integral from qthsh.pdf Appendix C:

lua> f = function(x) return exp(-x)/x end
lua> Q.peak(f, 1) -- default d=1
0.5980678336446792            3.218395385172793

With default d=1, code does 3 function evaluations. (note: code optimized away last column)
Code:
N   x     f(1+x)                f(1+x)*x
1   1     0.06766764161830635   0.06766764161830635
2   2     0.01659568945595465   0.0331913789119093
3   1/2   0.14875344009895322   0.07437672004947661

lua> Q.quad(Q.count(f), 1, huge), Q.n          -- d = 1 (default)
0.2193839343909888      129
lua> Q.quad(Q.count(f), 1, huge, 3.218), Q.n   -- d ≈ extrapolated x-intercept
0.21938393439552076     69

Update: code adjusted to quit early if it detects sign changes.
04-29-2021, 07:33 PM
Post: #73
 robve Member Posts: 140 Joined: Sep 2020
(04-26-2021 03:08 PM)Albert Chan Wrote:  An improvement to previous post Q.peak().

Great! The overhead cost of evaluating the peak per integrand is low, only 3.4 points on average for the 208 integrals.

The overall results with respect to reducing the number of Exp-Sinh evaluations needed per integrand for eps=1e-9:
91 improved by 65.9 fewer evaluations on average
91 are worse by 45.3 more evaluations on average

Funny the split in 91 and 91. Coincidence?

To compare, the current version of exp_sinh_opt_d for eps=1e-9 gives:
29 improved by 63.2 fewer evaluations on average
14 are worse by 11.3 more evaluations on average (almost all oscillatory cases that are known to be problematic for Exp-Sinh and are marked N/A)

I noticed that when a sign change is detected, peak() also does very well overall, except for one strange case that costs a lot more: 180 points more for log(1+9*(x*x))/(1+16*(x*x)), where d=10.9477 while a small d<1 is better, meaning the movement of d is in the wrong direction? However, for the remaining 28 (or so) cases peak() can be closer to the optimum, simply because exp_sinh_opt_d() produces geometric values for d up to a large d=2^16, while peak() assumes the optimum is close to the given d (according to Albert). However, for exp_sinh_opt_d() we could refine d, e.g. with another step or with interpolation, which is not done in the current exp_sinh_opt_d() version.

The tentative(!) results are in the attached PDF for exp_sinh_opt_d() (left) and peak() (right). Yellow rows indicate that exp_sinh_opt_d() was applicable due to a sign change; orange rows also have a sign change but are marked N/A for bad ACCURACY because of large integration errors due to oscillatory functions etc., so these are not interesting to compare. The ACCURACY column shows IMPROVED if the abs.err is lower than the original unoptimized Exp-Sinh integral's abs.err (not shown in the PDF). However, ACCURACY can be noisy because the abs.err is so small. The abs.err of the relevant cases is mostly better than the specified eps=1e-9 anyway, meaning we should be fine:

I feel that progress has been made with both methods. When I have time, I plan to take a closer look at further improvements and directions to take this.

Also, I assume something similar can be done for Sinh-Sinh, to split the integral at a more optimal point in some cases.

Albert, you have the following "magic numbers" hardcoded without explanation:
while L/C > 2.353 do
while R/C > 0.425 do

Are these empirically established?

- Rob

04-29-2021, 11:38 PM
Post: #74
 Albert Chan Senior Member Posts: 1,405 Joined: Jul 2018
(04-29-2021 07:33 PM)robve Wrote:  I noticed that when a sign change is detected, peak() also does very well overall, except for one strange case that costs a lot more: 180 extra points for log(1+9*(x*x))/(1+16*(x*x)), where peak() picks d=10.9477 while a small d<1 is better. Does that mean d moved in the wrong direction?

Q.peak() got the direction wrong. (With points at 1/4, 1/2, 1; the left-side intercept ≈ 0.158.)

Q.peak() always goes for the x-intercept on the right.
In other words, it tries to push the bulk of the area into $$\int_a^{a+d}$$.
Sometimes that is not appropriate, when we really need more points close to a.

Without knowing the shape of the curve, the peak is the safe choice for d.

Quote:Albert, you have the following "magic numbers" hardcoded without explanation:
while L/C > 2.353 do
while R/C > 0.425 do

Are these empirically established?

With a quadratic fit, the last two points cannot be too flat, otherwise the tangent line would be way off.
I based the numbers on $$\int_0^∞ e^{-mx}\,dx$$, which has optimal d ≈ 6/m to 7/m.

Let f(x) = exp(-m*x), g(x) = f(x)*x

When g(2x)/g(x) ≈ 85%, we consider g to have just passed its peak.

Since g(2x)/g(x) = 2*f(2x)/f(x), that means f(2x)/f(x) ≈ 0.85/2 = 0.425; or in the reverse direction: 1/0.425 ≈ 2.353

With the 85% setting, the x-intercept roughly hits the sweet spot.

lua> F = function(m) return function(x) return exp(-m*x) end end
lua> for m=1,9 do p,x=Q.peak(F(m)); print(m, m*x) end
1       5.734205371025561
2       5.734205371025561
3       4.41325021172599
4       5.734205371025561
5       4.509031895026867
6       4.41325021172599
7       8.628727480200146
8       5.734205371025561
9       4.841791367031226
04-30-2021, 05:13 PM
Post: #75
 Albert Chan Senior Member Posts: 1,405 Joined: Jul 2018
(04-29-2021 11:38 PM)Albert Chan Wrote:  Without knowing the shape of curve, peak for d is the safe choice.

This is work in progress, but if we get the peak and both x-intercepts, we have an idea of the curve's shape.

Code:
function Q.peak2(f, a, d)       -- find peak of f(a+x)*x, guess=d
    a = a or 0
    d = d or 1
    local k, L, C, R = 2, f(a+d), f(a+d*2)
    while L/C > 2.353 do d=d/2; R=C; C=L; L=f(a+d) end
    if R then d=d*4 else d=d*2; R=C; C=L end
    while R/C > 0.425 do        -- stop if f(a+x)*x passes peak
        d = d+d
        L, C, R = C, R, f(a+d)
    end
    L, C = L/4, C/2             -- 3 points: (0,R),(1,C),(2,L)
    local d0, d1 = C-R, L-C     -- finite difference, fit quadratic p(x)
    local P = d0/(d1-d0) - 0.5  -- interpolate for peak, p'(x)=0, for -x
    R = R/(1.5*d0-0.5*d1)       -- extrapolate x-intercept, -x = R/p'(0)
    local shift = (L/C > 0.85)  -- extra point for left-side x-intercept
    if shift then C=L; L=f(a+d/8)/8; d0=d1; d1=L-C end
    L = L/(1.5*d1-0.5*d0) - (shift and 3 or 2)
    return d*2^P, L-P, R-P      -- both x-intercepts, log2 scale, rel. to peak
end

lua> f = function(x) return log(1+9*(x*x))/(1+16*(x*x)) end -- previous post example
lua> Q.peak2(f)
0.9456442962396107      -2.580895653494933      3.5331822306621703

The peak of f(x)*x is at x ≈ 0.9456.
The other two numbers are the x-intercepts, in log2 scale, relative to the peak:
Left-side x-intercept ≈ 0.9456 * 2^-2.581 ≈ 0.158
Right-side x-intercept ≈ 0.9456 * 2^3.533 ≈ 10.9

We should pick the steeper side (the x-intercept with minimum log2 absolute value).
If there is no steeper side, the curve is relatively symmetrical; just use the peak for d.

Recommended d = left side = 0.158

---

Both of these have their peak at about 1.0 ... which side to use for d?

lua> f1 = function(x) return exp(-x) end
lua> f2 = function(y) return f1(1/y)/y^2 end
lua> Q.peak2(f1)
0.932573172731174       -2.983219913988427      2.6203047422648424
lua> Q.peak2(f2)
1.072301916075236       -2.6203047422648424     2.9832199139884263

Note the symmetry of the f1 and f2 intercepts. This is expected: f2 ≡ f1 under the substitution x=1/y, dx=-dy/y².
Again picking the steeper side, the symmetry carries over to the optimal d:

f1 optimal d ≈ 2^2.62 ≈ 6.15
f2 optimal d ≈ 2^-2.62 ≈ 0.163