[WP34s] Regularized incomplete Beta function

05042014, 05:26 PM
(This post was last modified: 05042014 05:33 PM by Dieter.)
Post: #21




RE: [WP34s] Regularized incomplete Beta function
(05042014 06:29 AM)Paul Dale Wrote: I can only agree with you here. I'm using a fairly generic Newton based solver for all the quantile functions except the normal QF. I know it can be very slow, that is the price for having a single general solver. Even a simple Newton solver should converge quadratically. I think the slow execution of the current 34s quantile implementations is due to an error in the initial guess for the solver. It should be within approx. 10% of the true quantile. So even a Newton solver should not require much more than five iterations, which means 6  10 s in user code. Quote:The 34S certainly doesn't have space for a C implementation of these. I originally implemented all the distribution functions in C but had to rewrite them as keystroke programs for space reasons. Adding additional custom solvers is likewise going to consume precious bytes. Thus, the 34S is unlikely to see any distribution speed ups. However, the 31S quite possibly will. Great. And many things are easier to do as there is no DP mode. ;) Quote:I'd like to rewrite the distribution code there in native C  I can't just drop in the original distribution code, we moved on algorithmically. I recently tried some improvements, especially with the solver. The following ideas refer to the Student distribution, but they should be applicable for the Chi² and Fisher cases as well. The following thoughts assume the quantile is ≥ 0 and p is the upper tail probability (i.e. p ≤ 0.5). As usual, the rest is done with a simple transformation, e.g. QF(1–p) = –QF(p). So, what can be done? 1. Use a Halley solver. It converges very fast and it can be implemented easily since the second derivative is a function of the first (i.e. the pdf). Student: f"(x) = (n+1)x / (n+x^2) * pdf Chi²: f"(x) = (nx2)/(2x) * pdf Fisher: f"(x) = (n(m(x1)+2) + 2mx) / (2x(mx+n)) * pdf where n (resp. m and n) are the degrees of freedom. So f"(x) simply is the pdf times a factor. This way a Halley iteration e.g. for the Student case can be written as follows: r := (StudentUpperCDF(x)  p) / StudentPDF d := r / ( r * (n+1)x / 2(n+x^2)  1 ) x := x  d Due to the roughly cubic convergence, the iteration usually may quit as soon as d < 1E–10*x for 30+ digit accuracy. In my 34s program I tried a conservative CNVG 00 (i.e. rel. error < 1E–14) which usually returns results that are as good as a user code program gets when running in DP mode (approx 3034 digits). The same idea can be used with the Chi² and Fisher quantile. 2. I slightly modified the initial guess for the Chi² quantile and I tried a new approach in the Student case, based on a 1992 paper on bounds of various quantiles by Fujikoshi and Mukaihata. The idea is a simple transformation of the normal quantile z: t = sqrt(n * (e^{a * z^2/n}  1 )) Where a is close to 1 and a function of n (or simply 1 so that it can be omitted). I used a = 1 + 1/(e*n) which works very well with the Normal estimate I used (simply the one in the Normal quantile function). This works fine for the center but less so for the distribution tails. For all p < 12^{–n} I use a slight modification of the tail approximation suggested years ago: u = 2 * p * n * sqrt(pi / (2 * n  0.75)) t = sqrt(n) / u ^ (1 / n) The 0.75 originally was a 1, but changing this value improves the results for low n. Although the Student estimate originally was intended for n≥3 it also works well for n=2 or even n=1. Usually it converges within three iterations, here and there in maybe four. This means that the code for n=1 and n=2 (direct solutions) may be omitted. For x close to 0 (i.e. p near 0.5) the expression t_u(x) – p loses accuracy due to digit cancellation. So I used the same idea as in the Normal quantile routine and had this value calculated differently for small t, using the incomplete Beta function. Yes, that's why I found the bug discussed in this thread. ;) Code:
For best accuracy this should run in DP mode. The program exits if the last two approximations agree in approx. 14 digits. At this point the result usually carries 30+ valid digits. Here are some examples: Code:
Please note that the results for n=1 are exact to 33 resp. 34 digits while the current implementation (that calculates the result directly) gets only 32 resp. 24 (!) digits right. Either the internal tangent function is not that accurate or the current implementation does not evaluate the quantile as 1/tan(1E–10*180°) which would yield a nearly perfect result. ;) Dieter 

05052014, 02:20 AM
Post: #22




RE: [WP34s] Regularized incomplete Beta function
(05042014 05:26 PM)Dieter Wrote: ... 24 (!) digits right. Either the internal tangent function is not that accurate or the current implementation does not evaluate the quantile as 1/tan(1E–10*180°) which would yield a nearly perfect result. ;) The one df case is implemented as tan(pi(x1/2)) which loses digits in cases like these. I missed the transformation to cot(pi x) when I implemented this I'm pretty sure TAN is up to the task. More usefully, your QF implementation isn't much longer than the current one and it avoids the generic solver. I might see if I can substitute it on the 34S too. It isn't all that straightforward because several functions you use are implemented in XROM and can be used directly. Rather, they have to be called using XEQ and they'll mess the stack up.  Pauli 

05052014, 08:26 AM
Post: #23




RE: [WP34s] Regularized incomplete Beta function
After a bit of debugging, the new StudentT quantile function code appears to be working on both the 34S and the 31S. The next builds will get this
 Pauli 

05052014, 11:02 AM
Post: #24




RE: [WP34s] Regularized incomplete Beta function
Is it worthwhile using CNVG? 03 which will choose 1e24 or 1e32 for the convergence criteria (this command is more conservative in XROM than in user code)? This will mean an additional iteration but will give better result in DP mode. I'd probably also increase the iteration limit to 6 or 7 to avoid the error case.
I don't see why it would be worthwhile given the fast convergence of Halley's method. We'd only be getting at most a couple of extra digits in DP mode.  Pauli 

05052014, 11:15 AM
Post: #25




RE: [WP34s] Regularized incomplete Beta function
The normal QF estimate used in your Student'sT is slightly different to the one I used in the normal.wp34s file. Only on the < .23 branch in the code.
Is the difference significant?  Pauli 

05052014, 08:14 PM
Post: #26




RE: [WP34s] Regularized incomplete Beta function
(05052014 11:15 AM)Paul Dale Wrote: The normal QF estimate used in your Student'sT is slightly different to the one I used in the normal.wp34s file. Only on the < .23 branch in the code. No. I just simplified the last term 0,254/u to 1/4u. The original version even is a tiny bit better. So you should leave it the way it is. I remember it was also finetuned to make sure the Normal quantile safely returns 32...34 digits in just two iterations. If all the distributions get reworked we could also switch to a simple rational approximation for the Normal estimate. I suppose a (2;2) type, i.e. with five real constants, should be sufficiently accurate. Since the initial guesses for the other distributions rely on a Normal guess we just have do decide now whether to switch or not. Designing such an approximation would require some time. On the other hand, everything also works fine with the current solution. Dieter 

05052014, 08:30 PM
Post: #27




RE: [WP34s] Regularized incomplete Beta function
(05052014 11:02 AM)Paul Dale Wrote: Is it worthwhile using CNVG? 03 which will choose 1e24 or 1e32 for the convergence criteria (this command is more conservative in XROM than in user code)? This will mean an additional iteration but will give better result in DP mode. If it really improves the result in DP mode and DP mode is available (i.e. on the 34s) it should be done  "accuracy is not optional". ;) However, if the proposed routine runs in DP mode I could not find a case where the result had less than 30 correct digits. Quote:I'd probably also increase the iteration limit to 6 or 7 to avoid the error case. During my tests the result always converged within 3, sometimes 4 iterations. A possible fifth loop was added "just to be sure". The error message was essentially included for testing to get a feedback if something really goes wrong. It should never show up, and during my tests it never did. Quote:I don't see why it would be worthwhile given the fast convergence of Halley's method. We'd only be getting at most a couple of extra digits in DP mode. If (!) we can get some additional digits and (!) DP mode is available (34s) another iteration is fine by me. But honestly I do not think that this will happen. Do you have an example where a fifth iteration yields better accuracy? FTR: for my own tests I used a threshold of 1E8 for a SP result. I think 1E14 is on the safe side, even in DP mode. Dieter 

05052014, 09:13 PM
(This post was last modified: 05052014 09:45 PM by Dieter.)
Post: #28




RE: [WP34s] Regularized incomplete Beta function
(05052014 02:20 AM)Paul Dale Wrote: The one df case is implemented as tan(pi(x1/2)) which loses digits in cases like these. If this really got implemented it should quit with an error message as soon as x resp. p is small enough so that 0,5–p equals 0,5. E.g. p=1E50 becomes tan(pi/2) which is not defined. But since there is no exact ndigit value for pi/2, the result becomes meaningless: my 34s returns t^{–1}(1E–50, 1) as +1,7924 E+33. Yes, a positive result. For comparison, the true result is –3,1831 E+49. Actually +1,7924 E+33 is returned for any (!) input below 1E34. Arguments below 0,1 lose one digit per magnitude. That's why t^{–1}(1E–10, 1) has merely 34–10 = 24 valid digits. And so the result is bogus for any input below 1E–34. In other words, the current implementation is not just inaccurate, it's simply buggy. Sorry. ;) Quote:I missed the transformation to cot(pi x) when I implemented this I'm pretty sure TAN is up to the task. That's exactly why the Normal and now the Student QF code use two different ways of evaluating CDF(x) – p. A special one for cases close to the center and the straightforward method otherwise. Likewise, if the Student dof=1 case is calculated directly, two different methods should be used as well: if p ≤ p_{crit}: t = –cot(pi · p) if p > p_{crit}: t = tan(pi · (p – 1/2)) Where p_{crit} can be anything between 0,1 and 0,4. Quote:More usefully, your QF implementation isn't much longer than the current one and it avoids the generic solver. I might see if I can substitute it on the 34S too. It isn't all that straightforward because several functions you use are implemented in XROM and can be used directly. Rather, they have to be called using XEQ and they'll mess the stack up. I see. Please also do your own tests. And I would love to know how fast this might run in compiled C code, e.g. on a 31s. ;) Dieter 

05052014, 09:40 PM
Post: #29




RE: [WP34s] Regularized incomplete Beta function
Thanks for the clarification. I'll use the existing normal QF estimate in the Student's T code to save some bytes.
 Pauli 

05052014, 09:49 PM
Post: #30




RE: [WP34s] Regularized incomplete Beta function
(05052014 08:30 PM)Dieter Wrote: However, if the proposed routine runs in DP mode I could not find a case where the result had less than 30 correct digits. The XROM code runs in double precision regardless of the user's setting. This is good for rounding back to single precision. Changing the convergence check will pretty much force an extra iteration to detect the convergence. Thus it will cause a slow down. So an increase to the maximum to 6 seems reasonable "just to be sure". No, I don't have an example that increases accuracy. I'm leaning towards the minimum of thirty digits being sufficient in DP, since we don't guarantee accuracy there.  Pauli 

05052014, 09:57 PM
Post: #31




RE: [WP34s] Regularized incomplete Beta function
(05052014 09:13 PM)Dieter Wrote: If this really got implemented it should quit with an error message as soon as ... I'm slightly embarrassed that this bit of code made it in and has been unnoticed for so long I don't know what I was thinking when I implemented it  probably rushing through recoding all the distributions. The good news is the modified Student's T code is in both the 31S and the 34S. Should be a nice speed up and bypasses the poor implementation of the one degree of freedom case. Quote:And I would love to know how fast this might run in compiled C code, e.g. on a 31s. ;) I suspect not as much faster as we'd like. The incomplete beta function, which is the heart of of the CDF, is in C already. Likewise for LogGamma in the PDF. The rest is predominately shuffling and scaling values which would speed up in C but I doubt they are the bulk of the time. C would give five more digits in intermediates. This would justify an extra iteration in DP mode I think.  Pauli 

05052014, 10:04 PM
(This post was last modified: 05052014 10:05 PM by Dieter.)
Post: #32




RE: [WP34s] Regularized incomplete Beta function
(05052014 09:49 PM)Paul Dale Wrote: Changing the convergence check will pretty much force an extra iteration to detect the convergence. Thus it will cause a slow down. So an increase to the maximum to 6 seems reasonable "just to be sure". Changing the convergence check from 1E14 to 1E24 equals roughly half an iteration (on average). ;) I am not sure if I understand this correctly, but do you say that CNVG? 00 which usually tests a relative error < 1E–14 automatically switches to 1E–24 if run in XROM? I think that 1E–14 is on the safe side. Quote:No, I don't have an example that increases accuracy. I'm leaning towards the minimum of thirty digits being sufficient in DP, since we don't guarantee accuracy there. Yes, that makes sense since this is what can be achieved with 34digit user code. The one or other last digit may be off. For my earlier humble contributions to the 34s project I tried to get 32 digits +/– 1 ULP. Dieter 

05052014, 10:16 PM
Post: #33




RE: [WP34s] Regularized incomplete Beta function
(05052014 10:04 PM)Dieter Wrote: I am not sure if I understand this correctly, but do you say that CNVG? 00 which usually tests a relative error < 1E–14 automatically switches to 1E–24 if run in XROM? I think that 1E–14 is on the safe side. No. CNVG? 00 will always check against 1E14. Depending on mode, CNVG? 03 chooses either 1E14 or 1E32 in user code but 1E24 or 1E32 in XROM.  Pauli 

05052014, 11:09 PM
(This post was last modified: 05052014 11:10 PM by Dieter.)
Post: #34




RE: [WP34s] Regularized incomplete Beta function
(05052014 10:16 PM)Paul Dale Wrote: CNVG? 00 will always check against 1E14. Depending on mode, CNVG? 03 chooses either 1E14 or 1E32 in user code but 1E24 or 1E32 in XROM. I now did the following test: After the quantile has been computed, a relative error of 1E–12 was added back to the result and the iteration restarted. Some code was added to display the relative error of the current approximation. As expected, the next iteration showed 1E–12. The following one mostly displayed a plain zero (no correction, result is considered exact) or an error near 1...5E–34. In two cases a correction near 1...3E–33 was applied. In other words, even if the last correction was 1E–12, it looks like we can expect 32...34 valid digits. So I assume a test against 1E–14 should be on the safe side. Dieter 

05062014, 12:54 AM
Post: #35




RE: [WP34s] Regularized incomplete Beta function
(05052014 11:09 PM)Dieter Wrote: In other words, even if the last correction was 1E–12, it looks like we can expect 32...34 valid digits. So I assume a test against 1E–14 should be on the safe side. Wouldn't we have to force one additional iteration after achieving the convergence precision to guarantee this level of accuracy? A value that differed from the previous by, say, 1E15, would pass the convergence test yet not be good. I'm thinking along the lines of: Code: FS? .02 instead of the: Code: CNVG? 00  Pauli 

05062014, 11:01 PM
(This post was last modified: 05062014 11:08 PM by Dieter.)
Post: #36




RE: [WP34s] Regularized incomplete Beta function
(05062014 12:54 AM)Paul Dale Wrote: Wouldn't we have to force one additional iteration after achieving the convergence precision to guarantee this level of accuracy? A value that differed from the previous by, say, 1E15, would pass the convergence test yet not be good. If the last correction was 1E15, this was the error of the previous approximation. Since Halley's method converges roughly cubic, the expected remaining error after this correction is below 1E40. Quote:I'm thinking along the lines of: That's the method that is used in Lambert's W function as well as (sort of) in the Integrate routine. But I think we can save this additional iteration. I set up a test program for the emulator and tested a dozen of different input values from 0,4999999999999999 to 1E300, with 1 to 10000 degrees of freedom (including every single value between 1 and 100). The results showed an interesting pattern: 1. The number of uncertain (last) digits in a perfectly converged result is quite uniformly the number of digits in the degrees of freedom. In other words: for n=1...9 you can expect 33 valid digits, for 10...99 it's 32 digits, for n=100..999 it's 31 digits, and so on. I assume this is because of the term x² + n within the iteration loop. 2. This level of accuracy is reached as soon as the first 12 digits agree. So with 14 matching digits we should be on the safe side, as already mentioned. I think that this should be sufficient even for a 39digit C function. 3. and most important: further iterations do not seem to guarantee a better result. Unless CDF(t) – p is exactly zero, they just cycle through various permutations of the last digits. Example: p=0,1 with 20 degrees of freedom. Initial guess for t: 1,312400768070987055677223263645284 After 1st iteration: 1,325340121782327835195769644183172 After 2nd iteration: 1,325340706985046343045588015602724 After 3rd iteration: 1,325340706985046343099953113344022 *** at this point CNVG 00 considers the quantile converged *** After 4th iteration: 1,325340706985046343099953113344020 After 5th iteration: 1,325340706985046343099953113344022 After 6th iteration: 1,325340706985046343099953113344020 ... True 34digit value: 1,325340706985046343099953113344018 4. There are cases where an additional iteration corrects the last digit(s) and then the approximation does not change any more. This does not neccessarily mean that the added iteration improves the result. Example: p=0,05 with 20 degrees of freedom After 3rd iteration: 1,724718242920787272832061725382084 This is the exact 34digit result. Further iterations change this to ...082 and then stay there. That's why I think that CNVG 00 should be fine here. Dieter 

05072014, 12:12 AM
Post: #37




RE: [WP34s] Regularized incomplete Beta function  
05082014, 06:26 PM
Post: #38




RE: [WP34s] Regularized incomplete Beta function
(05072014 12:12 AM)Paul Dale Wrote: I understand now. I now tend to think the accuracy limit is not in the quantile code but in the Student CDF itself (resp. in the probably underlying Incomplete Beta function). Comparing some results with the 34s t_u function showed the wellknown pattern: Code:
This pattern is consistent with information on the accuracy of the one or other Incomplete Beta algorithm included in mathematical software libraries. As a rule of thumb the number of potentially inaccurate digits seems to match the number of digits in the degrees of freedom. Which leads to the question if an upper limit in the dof value would make sense. For instance a maximum of 1000. This would apply to the Student as well as the Chi² and F distribution. Dieter 

05082014, 10:28 PM
Post: #39




RE: [WP34s] Regularized incomplete Beta function
(05082014 06:26 PM)Dieter Wrote: Which leads to the question if an upper limit in the dof value would make sense. For instance a maximum of 1000. This would apply to the Student as well as the Chi² and F distribution. Given we're only promising accuracy for single precision, there are many orders of magnitude left before the risk of digit loss becomes significant. I don't think I'd want to impose a limit suitable for DP that restricts results in SP, likewise I don't want a computation that errors out in DP but not SP. For Student'sT and Chi² distributions we could approximate with a normal for very high degrees of freedom I guess. I don't remember the asymptotic properties of the F distribution when one or both degrees of freedom get large.  Pauli 

05082014, 10:58 PM
Post: #40




RE: [WP34s] Regularized incomplete Beta function
(05082014 10:28 PM)Paul Dale Wrote: Given we're only promising accuracy for single precision, there are many orders of magnitude left before the risk of digit loss becomes significant. The user may enter dof values that lead to problems and strange results. At the moment I am looking at the Chi² quantile. For Chi² = n the CDF should be near 0,5 (slightly larger). Try this with n=2000 and you get 0,5042... (correct). Then try n=3000 and the 34s returns zero (!). Quote:For Student'sT and Chi² distributions we could approximate with a normal for very high degrees of freedom I guess. I don't remember the asymptotic properties of the F distribution when one or both degrees of freedom get large. I think that indeed very large dof values will be required until the Normal CDF matches the others only for SP accuracy. ;) Dieter 

« Next Oldest  Next Newest »

User(s) browsing this thread: 1 Guest(s)