HP PPL and Statistical distributions
|
04-11-2021, 08:49 AM
Post: #1
|
|||
|
|||
HP PPL and Statistical distributions
Goodmorning everyone. Can any of you point me to an algorithm written in HP PPL for the calculation of the "studentized range q distribution" CDF?
Thanks, Roberto. |
|||
04-11-2021, 09:51 AM
Post: #2
|
|||
|
|||
RE: HP PPL and Statistical distributions
In truth, HP PRIME is able to directly solve the integral that calculates the "P_Value" of the "studentized range q distribution". However, if the HP Prime emulator resolves the integral in about 1 second, the "hardware" HP Prime G2 resolves the integral in about 33 seconds. I therefore ask if there is an algorithm written in HP PPL to quickly calculate the "P_Value" of "studentized range q distribution", even using the HP Prime G2.
Thanks again, Roberto. #cas Studentized_Range_q_P_Value(q,k,ν):= BEGIN LOCAL funzione; funzione:=(√(2*π)*k*ν^(ν/2))/ (Gamma(ν/2)*2^(ν/2-1))*∫(s^(ν-1)* e^((-ν*s^2/2))/(√(2*π))*∫(e^((-z^2)/2)/ (√(2*π))*(normald_cdf(0,1,z+q*s)- normald_cdf(0,1,z))^(k-1),z, -(∞),+∞),s,0,+∞); RETURN 1-funzione; END; #end |
|||
04-13-2021, 05:43 AM
Post: #3
|
|||
|
|||
RE: HP PPL and Statistical distributions
I did some digging on this. The root of the issue is a typical problem of mathematical calculation. Expressions which are nice for math, are not always nice numerically. Even my Mathematica struggles on it. You need either a simpler expression or a fast approximate method. Of course the nature of either is it will be less general than the pure form, but how to handle that depends a lot on your application.
On the expression side, Wikipedia has nice closed-form solutions for special values of k, if you can be satisfied with a less general solution. If those don't work it might be possible to get some bounds for your application and ask a big boy CAS like Mathematica if it can find a nice solution. I did ask it to simplify the general formula on the offchance it can prove some new math for us, it appears it cannot. The simplest and typical implementation is to use tables and interpolate. For example, this is the approach given here. I also saw an approach like this recommended in one of the older papers on this function. Basically, load up a matrix with the table, do lookups and average nearby values. One of the papers I read recommended harmonic interpolation between table entries but that may be overkill. Another approach is to use a numerical method to slice open the nasty integrals, such as some kind of quadrature rule. Glancing at the formula, I'd guess you use some Laguerre method on one integral and a Hermite method on the other. Fortunately we don't have to guess, there are known algorithms for this function. Here is an older method (indeed, seems to be Laguerrish) and here is a modern one from R. Some of these methods have limitations on the parameters. There's also this survey of similar methods which has a large appendix with implementations. Unfortunately none of these are HP PPL, but it's probably straightforward to translate. |
|||
04-13-2021, 08:45 AM
Post: #4
|
|||
|
|||
RE: HP PPL and Statistical distributions
Hello, thank you very much for such a thorough answer. I'll ask you another question: the calculation of numerical integrals, by utilising the HP 50g calculator, provides, in setting the values, the calculation accuracy. For example, if I set a "Fixed 3" "number format", the accuracy with which the numerical integral is calculated includes up to the third decimal number. By setting a "low" accuracy, the HP 50g calculates the integral more quickly. Is it possible to set, in the HP PRIME calculator, a "low" calculation accuracy, in order to speed up the calculation of the integral?
best regards, Roberto Mioni. |
|||
04-13-2021, 08:18 PM
Post: #5
|
|||
|
|||
RE: HP PPL and Statistical distributions
This might be better answered by somebody who knows the calculator better. Part of the issue is that for a function like Integrate, it tries a lot of different methods. I believe that Prime, Giac/Xcas, and some HP48 routines can take a crack at integration, and each of those codebases have many methods inside them. I think some of these methods do look at CAS or Home precision, but I'm not sure all of them do. Based on the errors I get running your program, it seems a lot of time is spent trying different methods, and so that may be an issue regardless of accuracy.
You might have better luck using a specific method. There is a builtin romberg function, I don't know if it can handle your integrals but it's worth a try to replace them with romberg calls and see how it performs. There's also some numerical methods for the Prime written by Jhonatan Peretz that you can try, in particular he has a Legendere quadrature that might work well. Back to your question, there are specific integration methods that trade off time for accuracy. Monte carlo integration is the first that comes to mind, since you can reduce the iteration count. There is a double integral version for PPL by Eddie Shore, but it's probably simple enough you can port the simpler single algorithm from anywhere. The trick will be the infinite integration limits. Usually you can just use large/small values if the contribution outside the range will be small. |
|||
05-18-2021, 03:38 PM
Post: #6
|
|||
|
|||
RE: HP PPL and Statistical distributions
(04-13-2021 08:18 PM)compaqdrew Wrote: The trick will be the infinite integration limits. This is indeed the case. Recently, we discussed tanh-sinh, exp-sinh, sinh-sinh algorithm here Combined as 1, we have quad (my Lua implementation here, emace67 double_exponential.py here). First, I clean-up CAS code, and run in XCas (s lower limit changed to 0. produced some speed-up) Code: Studentized_Range_q_P_Value(q,k,ν):= Picked an example, from studentized-range-q-table From first page, P-value = 0.10, k=8, df=12, we have q = 4.511 XCas> Studentized_Range_q_P_Value(4.511, 8, 12) Evaluation time: 6.24, returned 0.100017379207 Now, rewrite code in lua, using quad to integrate. Code: do lua> Studentized_Range_q_P_Value(4.511, 8, 12) 0.10001737920797527 We got the same answer ... But, LuaJIT take 0.003 second |
|||
05-18-2021, 05:23 PM
Post: #7
|
|||
|
|||
RE: HP PPL and Statistical distributions
Hello Albert Chan, how do I write the lua code in the HP PRIME?
Thanks a lot, Roberto |
|||
05-18-2021, 06:03 PM
Post: #8
|
|||
|
|||
RE: HP PPL and Statistical distributions
Is it possible to implement the Lua implementation of quad () for HP PRIME (perhaps with Python)?
|
|||
05-18-2021, 11:15 PM
(This post was last modified: 05-25-2021 12:24 PM by Albert Chan.)
Post: #9
|
|||
|
|||
RE: HP PPL and Statistical distributions
Hi, robmio
Here is the translated quad, in PPL. I hope this is right, as I am learning PPL as I code. Compare to Lua or Python, there are some gothas: 1. all local variables have to declare up front. 2. there is no Lua elseif or Python elif equivalent. 3. there is no logic precedence: a OR b AND c translated to ((a OR b) AND c) 4. comparison operators has "sugar" in it, which may translated wrongly. Cas> a > b > c → a > b AND b > c ... looks good Cas> a > b > c > d → ((a>b) AND (b>c)) > d ??? Cas> (a>b) == (mode == 2) → a>b and b == mode == 2 ??? Code: #cas Usage: Cas> pdf(x) := exp(-x*x/2) / sqrt(2*pi) Cas> quad(pdf, -1, 1) → [0.682689492137, 2.08159857137e−14] Cas> quad(pdf, 1, inf) → [0.158655253928, 1.81990772641e−10] Cas> quad(pdf, 1, -inf) → [-0.841344746, 7.74963867073e−11] Cas> quad(pdf, -inf, 1) → [ 0.841344746, 7.74963867073e−11] Cas> quad(pdf, -inf, inf) → [1., 8.45545855554e−13] |
|||
05-19-2021, 02:39 PM
Post: #10
|
|||
|
|||
RE: HP PPL and Statistical distributions
I was unable to get double integrals using only quad. Any ideas ?
Second best option, wrap outer integral with quad, inner with int. Code: #cas // Studentized P_Value, from q,k,v Cas> SRP_qkv(4.511, 8, 12) → 0.100017379208, finished in 0.45 second For unknown reason, Studentized_Range_q_P_Value(4.511, 8, 12) hang the emulator. Is there a way to interrupt calculations, besides restarting emulator ? |
|||
05-19-2021, 04:56 PM
Post: #11
|
|||
|
|||
RE: HP PPL and Statistical distributions
(05-19-2021 02:39 PM)Albert Chan Wrote: I was unable to get double integrals using only quad. Any ideas ? Problem solved ! Parameters of function also required declared as local variables. I had discovered this when checking memory of Cas variables. I would have guessed parameters local by default ... weird. Code: #cas // Studentized P_Value, from q,k,v Cas> SRP_qkv(4.511, 8, 12) → 0.100017379208, finished in 0.184 second |
|||
05-19-2021, 06:00 PM
Post: #12
|
|||
|
|||
RE: HP PPL and Statistical distributions
Dear Albert Chan, it's strange, even your latest version doesn't work on the HP PRIME emulator. Where am I doing wrong?
|
|||
05-19-2021, 06:35 PM
Post: #13
|
|||
|
|||
RE: HP PPL and Statistical distributions
Hi, robmio
Your screenshot was the one I get, without LOCAL s,z; I did a emulator reset, to be sure: build 2.1.14181 (2018 10 16) Then, load back quad and SRP_qkv ... Everything is working fine. After running SRP_qkv(4.511, 8, 12) again. These are in Cas variables: SRP_qkv (Function) 2KB quad (Function) 0.25KB quad6 (Function) 9KB Paradoxically, with hard reset, LOCAL s,z; is not needed anymore. Code: #cas // Studentized P_Value, from q,k,v |
|||
05-19-2021, 07:05 PM
Post: #14
|
|||
|
|||
RE: HP PPL and Statistical distributions
I followed your instructions - it works fine!
|
|||
05-20-2021, 12:40 PM
Post: #15
|
|||
|
|||
RE: HP PPL and Statistical distributions
Now that we have it running, we may optimize a bit.
Inner integral, \(\int_{-∞}^∞\), work best if bulk of area inside \(\int_{-d}^d\) (quad defaulted d = 1) Inner integrand is bell-shaped, with center shifted left, we like to expand d = 4 (*) One way is simple substitution: \(\int_{-∞}^∞ f(x)\,dx = \int_{-∞}^∞ f(4u)(4\,du) \) Now, LHS \(\int_{-4}^4\) = RHS \(\int_{-1}^1\) This is the reason quad6() exposed other parameters. We can simply set d = 4. Code: #cas // Studentized P_Value, from q,k,v Cas> SRP_qkv(4.511, 8, 12) → 0.100017379208, finished in 0.11 second --- (*) without shift, quad(z->exp(-z^2/2), -inf, inf) work better with d=8 However, if shifted, it might get worse, as d=9 example shows. lua> Q = require'quad' lua> f = function(z) return exp(-z*z/2) end lua> Q.quad(Q.count(f), -huge, huge, 1), Q.n 2.506628274630998 119 lua> Q.quad(Q.count(f), -huge, huge, 2), Q.n 2.506628274630838 101 lua> Q.quad(Q.count(f), -huge, huge, 4), Q.n 2.506628274630999 43 lua> Q.quad(Q.count(f), -huge, huge, 8), Q.n 2.506628274630997 33 lua> Q.quad(Q.count(f), -huge, huge, 9), Q.n 2.506628274630931 55 |
|||
05-20-2021, 04:10 PM
Post: #16
|
|||
|
|||
RE: HP PPL and Statistical distributions
Optimization cuts execution times in half. thank you very much
|
|||
05-22-2021, 12:50 PM
Post: #17
|
|||
|
|||
RE: HP PPL and Statistical distributions
A nice thing about infinity, you can compress it, shift it, and it is still infinity.
cdf function, Φ(z) = 1/2 * (1+erf(z/√2)). We can substitute x=√2*z, and avoid calls to Φ(z), which internally called erf. XCas moyal.cc Wrote:static gen normal_cdf(const gen & g,GIAC_CONTEXT){ The substitution also simplified exp(-z*z/2) to exp(-x*x) Do the same way for outer integral, letting y = √2*s, and pull out the constants, we have: (d adjusted to offset effect of scaling: 4/√2 ≈ 2.8, 1/√2 ≈ 0.71) Code: #cas // Studentized P_Value, from q,k,v Cas> SRP_qkv(4.511, 8, 12) → 0.100017379208, finished in 0.085 second |
|||
05-22-2021, 05:18 PM
Post: #18
|
|||
|
|||
RE: HP PPL and Statistical distributions
Thanks a lot to Albert Chan who reduced the computation time to 1 second / 1.2 seconds (HP PRIME G2).
In some posts I have noticed that the HP PRIME calculator becomes even faster if programmed in PYTHON language. Therefore, I tried to translate Albert Chan's quad6-program from #Cas ... #end to PYTHON. However, I was unable to get the program to work - where am I wrong? Below the program: Code:
|
|||
05-22-2021, 06:46 PM
Post: #19
|
|||
|
|||
RE: HP PPL and Statistical distributions
Some issues with PYTHON program (assumed it is true Python):
1. Turning repeat until to while loop must ensure it run at least once. Loop counts must also match. It might be safer translate repeat-until as: while 1: ... if until_condition: break 2. True Python does not turn 1./0. to +inf, but raise ZeroDivisionError instead. Why Python's ZeroDivisionError for floating-point type is a bad and unnecessary feature In fact, Python seldom return float('inf'), for floating math operations. (1e300*1e300 return +inf, but 1e300 ** 2 raise OverflowError ... weird) >>> from math import * >>> log(0) Traceback (most recent call last): File "<stdin>", line 1, in <module> ValueError: math domain error >>> exp(1000) Traceback (most recent call last): File "<stdin>", line 1, in <module> OverflowError: math range error We have to catch exceptions instead. see emece67 double_exponential.py This is why quad.lua translated to PPL. It behave closer to Lua: CAS> 1./0. → +Inf CAS> log(0.) → -Inf CAS> exp(1000.) → +Inf |
|||
05-23-2021, 01:32 PM
Post: #20
|
|||
|
|||
RE: HP PPL and Statistical distributions
PPL allowed comparing complex numbers with symbolic inf (not DOM_float inf = 1.0/0)
CAS> inf > 3+4i → 1 CAS> -inf > 3+4i → 0 CAS> type(approx(inf)) → DOM_SYMBOLIC This feature extended quad for complex limits, without changes CAS> f(z) := (2/√π) *e^(-z*z) CAS> erf(3+4i) → -120.186991395-27.7503372936*i CAS> quad(f, 0, 3+4i)(1) → -120.186991395-27.750337294*i CAS> quad(f, 3+4i, inf)(1) → 121.186991394+27.7503372904*i = 1 - erf(3+4i) CAS> quad(f, inf, 3+4i)(1) → -121.186991394-27.7503372904*i = erf(3+4i) - 1 |
|||
« Next Oldest | Next Newest »
|
User(s) browsing this thread: 4 Guest(s)