FFT Multiplication (HP-48G/GX/G+)
|
02-10-2014, 09:25 PM
Post: #1
|
|||
|
|||
FFT Multiplication (HP-48G/GX/G+)
This is an RPL adaptation of Valentin Albillo's FFTMULT program for the HP-71B:
DatafileVA013. (HP-71B Fantastic FOUR).pdf (Size: 109.47 KB / Downloads: 51) That's a straightforward adaptation, except that I've used IFFT which the HP-71B lacks. Code:
Here is an example from Valentin's article: { 5476 1407 } { 8613 2724 } FFTMULT --> { 4716 7491 5498 2668 } This matches Valentin's result. However, I've found a discrepancy in this example: { 1234 5678 9012 } { 3456 7890 } FFTMULT --> { 426 7639 7023 2002 4680 } This should have been { 426 7640 7023 2002 4680 }, as we would get using Valentin's algorithm: 10 OPTION BASE 1 @ DIM X(3),Y(2),Z(5) 15 READ X,Y 20 CALL FFTMULT(X,Y,Z) @ MAT DISP Z 100 END 200 DATA 1234,5678,9012,3456,7890 900 SUB FFTMULT(A(),B(),C()) @ I=UBND(A,1) @ J=UBND(B,1) 910 K=I+J @ COMPLEX X(I),Y(J) @ MAT X=A @ MAT Y=B 920 N=LOG2(MAX(I,J)) @ E=10000 @ N=2^(INT(N)+(FP(N)#0)) 930 M=2*N @ COMPLEX X(M),Y(M),Z(M) @ MAT X=FOUR(X) 940 MAT Y=FOUR(Y) @ FOR I=1 TO M @ Z(I)=X(I)*Y(I) @ NEXT I 950 COMPLEX Z(M,1) @ MAT Z=TRN(Z) @ MAT Z=FOUR(Z) 960 MAT Z=TRN(Z) @ MAT Z=(M*M)*Z @ COMPLEX Z(M) 965 MAT DISP Z @ DISP 970 FOR I=1 TO K @ C(I)=IROUND(REPT(Z(I))) @ NEXT I 980 FOR I=K-1 TO 1 STEP -1 @ J=I+1 @ C(J)=C(J)+MOD(C(I),E) 990 C(I)=C(I) DIV E+C(J) DIV E @ C(J)=MOD(C(J),E) @ NEXT I >RUN (4264704.00002,-0) (29359428.0003,.000000128) (75944892,-0) (71104680,.000000512) (-.000016,-0) (-.00025024,-.000000128) (0,-0) (-.0002352,-.000000512) 426 7640 7023 2002 4680 As a comparison, the first element of the inverse Fourier transform is computed as (4264703.99998, 0) by the HP-48 calculator, which suggests is should be actually (4264704, 0). Perhaps the addition of a suitable quantity to the real part of these values should fix the problem and allow for chunks of 5 digits, but I haven't tried it yet. Another issue is speed. My HP-48GX is about 50% slower than the HP-71B when multiplying two 512-digit numbers (173 seconds versus 117 seconds). But of course the RPL program can be improved, as you can see by the code. Anyway, when these issues are solved (all of you who are interested are invited to try) this can make for a nice tool to compute pi to a couple of thousand digits, for instance. |
|||
02-10-2014, 10:02 PM
(This post was last modified: 02-11-2014 08:16 AM by Gerson W. Barbosa.)
Post: #2
|
|||
|
|||
RE: FFT Multiplication (HP-48G/GX/G+)
(02-10-2014 09:25 PM)Gerson W. Barbosa Wrote: . My HP-48GX is about 50% slower than the HP-71B when multiplying two 512-digit numbers (173 seconds versus 117 seconds). Two 256-digit numbers, actually, giving a 512-digit result. Sorry for the mistake. (02-10-2014 09:25 PM)Gerson W. Barbosa Wrote: However, I've found a discrepancy in this example: This should be fixed by inserting the following instructions between RE and IP in the code above, at the end of the second line after the IFFT command: Code:
More tests have to be made, though. P.S.: .05 is better. No problem with 640-digit results (128 groups of 5 digits, constants replaced with 100,000), at least for the few examples I tried. P.P.S.: Replacing RE IP with RE 0 RND is even better. Actually that's what should've been done in the first place if somehow I hadn't mistaken IROUND with IP. Since the HP-48 built-in IFFTis more accurate than the implementation using FOUR on the HP-71B it is possible groups of 5 digits rather than 4 can be used for up to 1000-digit results. A few empirical tests, as described in the last page of Valentin's article, should be made before we can be sure, however. Here is the fixed code: Code:
|
|||
02-11-2014, 09:04 AM
Post: #3
|
|||
|
|||
RE: FFT Multiplication (HP-48G/GX/G+)
Hi,
(02-10-2014 10:02 PM)Gerson W. Barbosa Wrote:(02-10-2014 09:25 PM)Gerson W. Barbosa Wrote: . My HP-48GX is about 50% slower than the HP-71B when multiplying two 512-digit numbers (173 seconds versus 117 seconds). You could speed up things by trying to avoid object type changes where possible. The snippet below shows some time-consuming actions from your code. (02-10-2014 10:02 PM)Gerson W. Barbosa Wrote: If you work with arrays anyway, you could try to avoid building lists. Creating and decomposing lists is very time-consuming. If possible, meta obs are recommended. I didn't dive into the code too deeply, but I'm sure the functionality can be made faster on an HP 48GX than any code giving the same results on an HP-71B;-) -- Ray |
|||
02-11-2014, 03:34 PM
Post: #4
|
|||
|
|||
RE: FFT Multiplication (HP-48G/GX/G+)
(02-11-2014 09:04 AM)Raymond Del Tondo Wrote: Hi, Thanks for your tips! I'd rather use vectors only, but lists are more convenient for certain tasks (or I just don't know how to do them using vectors). (02-11-2014 09:04 AM)Raymond Del Tondo Wrote: I didn't dive into the code too deeply, but I'm sure the functionality can be made faster on an HP 48GX than any code giving the same results on an HP-71B;-) I was disappointed with the HP-48GX, but now I see I can blame it on my sloppy adaptation. It surely can do better than this: # Groups Digits (results) Time 71B Time 48GX 16 128 24.11 14.39 32 256 53.41 40.95 64 512 116.74 199.96 Also, I suspect the performance decrease as the number of digits grow is due to lack of memory, as I have only 12 K left (and only one HP-48GX). Regards, Gerson. |
|||
02-11-2014, 09:22 PM
Post: #5
|
|||
|
|||
RE: FFT Multiplication (HP-48G/GX/G+)
I agree with Raymond, that using other data types improves the speed significantly.
But what about using SystemRPL? Using SystemRPL throwing away the OOP overhead and protection stuff improving also the execution speed. It's never too late to learn SystemRPL. ;-) Moreover, without SystemRPL we may not have emulators like Emu28, Emu42 and Emu71/Win. I began working on Emu48 in 1997 because I wrote SystemRPL programs since 1995 and tested them on my real HP48SX. After each of such a debugging session I always had to restore the memory content. With Kermit this took ~30 minutes. So I was looking for a simple way to restore data quickly. The problem was, that Emu48 0.99 and Emu48 1.0 wasn't working very well on Windows NT4, the story began... |
|||
02-11-2014, 10:48 PM
Post: #6
|
|||
|
|||
RE: FFT Multiplication (HP-48G/GX/G+)
(02-11-2014 09:22 PM)Christoph Giesselink Wrote: I agree with Raymond, that using other data types improves the speed significantly. Hey, I like that one, "It's never too late to learn XXXX". Much better than "Life is too short to learn German" (who said that?) :-) I regret never having tried to learn Saturn Assembly or SysRPL back in the day. Now that the HP-48 has been discontinued and the HP-50g is fast enough I don't feel motivated. But I have a notion of what I missed... Sticking to User-RPL I have started to rewrite the program following Raymond's advice. Indeed the delay is due to switching to and from lists and vectors. FFT on the HP-48GX takes about 1/6 of the time it takes FOUR on the HP-71B, exactly what one should expect just considering the clock rates of both devices. I've only just begun and running time has dropped from 200 to 175 seconds for 256-digit multiplications. The final array to vector conversion alone takes 15 seconds: Code:
I wasn't aware there was an array re-dimension command, RDM. Sometimes is pays to read the manual :-) (02-11-2014 09:22 PM)Christoph Giesselink Wrote: Moreover, without SystemRPL we may not have emulators like Emu28, Emu42 and Emu71/Win. I began working on Emu48 in 1997 because I wrote SystemRPL programs since 1995 and tested them on my real HP48SX. After each of such a debugging session I always had to restore the memory content. With Kermit this took ~30 minutes. So I was looking for a simple way to restore data quickly. The problem was, that Emu48 0.99 and Emu48 1.0 wasn't working very well on Windows NT4, the story began... I should thank you again for all those emulators, especially Emu48. Too used to RPN, I would make mistakes all the time when using the Windows calculator at work. I even started to write an RPN calculators in Delphi (my only attempt at the language) and almost succeeded, only the EEX key was left not implemented. But then I found Emu48 which became my only calculator on the PC :-) Best regards, Gerson. |
|||
02-12-2014, 02:41 AM
Post: #7
|
|||
|
|||
RE: FFT Multiplication (HP-48G/GX/G+)
HP 50g version:
Code:
[ 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. ] [ 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. ] << FFTMULT>> TEVAL --> [ 152. 4157. 8753. 2388. 2031. 7024. 9644. 8712. 3910. 9892. 536. 5036. 5790. 2759. 1428. 1360. 7669. 5626. 2319. 7684. 9548. 8493. 3211. 4009. 1428. 1360. 4103. 333. 3307. 4227. 4994. 6657. 5186. 7094. 5886. 2981. 7065. 9961. 6777. 9305. 8945. 2828. 7669. 5630. 824. 5695. 8561. 1954. 2703. 8562. 9452. 8278. 4583. 1430. 344. 4602. 6462. 4297. 1236. 926. 8341. 7164. 2127. 7250. 9916. 1715. 5512. 8798. 8036. 8848. 4621. 2474. 6157. 5981. 3729. 6150. 4278. 3114. 2837. 9826. 2399. 247. 1946. 3502. 519. 7380. 1054. 7177. 8640. 4513. 163. 853. 6761. 1645. 9271. 4529. 4881. 8778. 8379. 8205. 3002. 5911. 7488. 1881. 1123. 3044. 6596. 5556. 9244. 177. 5704. 9232. 7364. 7310. 4813. 2908. 5485. 4443. 3921. 6584. 3606. 1576. 3030. 260. 1726. 8709. 2138. 3936. ] s:41.0499 The version using <<*>> DOLIST instead of HADAMARD for element by element vector multiplication is about 2 seconds faster for numbers this size, despite vector to list to vector conversions. At this point double input sizes imply in triple running times, which is strange. Sure the final part needs more optimization, but I don't see why this behavior occurs. |
|||
02-12-2014, 12:51 PM
Post: #8
|
|||
|
|||
RE: FFT Multiplication (HP-48G/GX/G+)
Quote:Hey, I like that one, "It's never too late to learn XXXX". Much better than "Life is too short to learn German" (who said that?) :-) Here's a flippant off-subject quote that may shed some light to this issue :-) One of my favorites essays from Mark Twain, if I may add - Sam Clemens at his best. http://en.wikipedia.org/wiki/The_Awful_German_Language Auf wieder-lessen! "To live or die by your own sword one must first learn to wield it aptly." |
|||
02-12-2014, 07:09 PM
(This post was last modified: 02-12-2014 07:11 PM by Han.)
Post: #9
|
|||
|
|||
RE: FFT Multiplication (HP-48G/GX/G+)
Gerson's post already alluded to this, but there is a HUGE difference between computing with integers and non-integers on the HP50G. Since the HP49G, integers were considered a new object. That is, 2 is very different from 2. from an object-type point of view. So you will see a dramatic increase in speed just by switching your input from integer values to "integers-with-a-trailing-decimal-point."
The code Gerson posted could be made even faster by changing all the integers to decimals. 1 -> 1. and 2 -> 2. etc. Graph 3D | QPI | SolveSys |
|||
02-12-2014, 07:38 PM
Post: #10
|
|||
|
|||
RE: FFT Multiplication (HP-48G/GX/G+)
(02-12-2014 07:09 PM)Han Wrote: Gerson's post already alluded to this, but there is a HUGE difference between computing with integers and non-integers on the HP50G. Since the HP49G, integers were considered a new object. That is, 2 is very different from 2. from an object-type point of view. So you will see a dramatic increase in speed just by switching your input from integer values to "integers-with-a-trailing-decimal-point." You're right! From 41 down to 30 seconds in the above example, no matter the 50g is in exact or approximate mode. Thanks for reminding! (02-12-2014 07:09 PM)Han Wrote: The code Gerson posted could be made even faster by changing all the integers to decimals. 1 -> 1. and 2 -> 2. etc. The easier way to do this is copying the program to the stack and saving it again in approximate mode. All integer constants will automatically change to float. Regards, Gerson. |
|||
02-12-2014, 08:09 PM
(This post was last modified: 02-12-2014 08:16 PM by Gerson W. Barbosa.)
Post: #11
|
|||
|
|||
RE: FFT Multiplication (HP-48G/GX/G+)
(02-12-2014 12:51 PM)Ángel Martin Wrote:Quote:Hey, I like that one, "It's never too late to learn XXXX". Much better than "Life is too short to learn German" (who said that?) :-) Hola, Ángel! I knew the story already, but thanks for the link! I don't think Mark Twain's complaints are fair, I am sure he tried to learn German through a completely wrong method, that is, paying too much attention to grammar matters. I learned English in five months when I was 15, attending 1-hour classes twice a week (ok, two more semesters to consolidate it and still learning it to date :-) During those 18 months no word about Grammar from my Belgian teacher, no books, no copybooks -- just talking. I only regret not having attended his French and German classes, but then again that would have been too expensive for my parents. ¡Hasta la leída! Gerson. |
|||
02-13-2014, 01:16 AM
Post: #12
|
|||
|
|||
RE: FFT Multiplication (HP-48G/GX/G+)
(02-12-2014 02:41 AM)Gerson W. Barbosa Wrote: [ 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. ]Hi, I just made a non-optimized SysRPL version derived from your previous prog version, which runs the above arrays in about 67 seconds on a normal HP 48GX. There's _much_ room for streamlining, which I may do tomorrow;-) Cheers Raymond -- Ray |
|||
02-13-2014, 03:16 AM
(This post was last modified: 02-13-2014 03:19 AM by Gerson W. Barbosa.)
Post: #13
|
|||
|
|||
RE: FFT Multiplication (HP-48G/GX/G+)
(02-13-2014 01:16 AM)Raymond Del Tondo Wrote:(02-12-2014 02:41 AM)Gerson W. Barbosa Wrote: [ 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. 1234. 5678. 9012. 3456. ]Hi, I just made a non-optimized SysRPL version derived from your previous prog version, which runs the above arrays in about 67 seconds on a normal HP 48GX. In order to avoid lists I tried the following in the latest HP-48GX program: Code:
67 seconds is pretty good considering the hp 50g does is 30 seconds. Cheers, Gerson. |
|||
02-14-2014, 12:07 AM
Post: #14
|
|||
|
|||
RE: FFT Multiplication (HP-48G/GX/G+)
(02-13-2014 01:16 AM)Raymond Del Tondo Wrote: Hi, I just made a non-optimized SysRPL version derived from your previous prog version, which runs the above arrays in about 67 seconds on a normal HP 48GX. Hello Raymond, You might want to try the latest UserRPL version: Code:
It does the same in about 52 seconds, timed with this program: Code:
The previous version took about 175 seconds. No significant gain on the hp 50g with this new version, though (29 s --> 26.5 s). Code:
Thanks for you suggestions and advice! Gerson. |
|||
02-14-2014, 12:34 AM
Post: #15
|
|||
|
|||
RE: FFT Multiplication (HP-48G/GX/G+)
(02-13-2014 03:16 AM)Gerson W. Barbosa Wrote: 67 seconds is pretty good considering the hp 50g does is 30 seconds.Hi, now my SysRPL version runs in about 22.9 seconds for the two 128 element arrays on a normal HP 48GX. The most time-consuming parts are now the two FFT and the IFFT calls. The final loop, which was the slowest part in the original program, now needs about 3 seconds only;-) One could squeeze out another 2 seconds by using integer arithmetic in the final loop if applicable... I'll post the commented SysRPL source tonight or tomorrow. Cheers Raymond -- Ray |
|||
02-14-2014, 01:31 AM
(This post was last modified: 02-14-2014 01:45 AM by Gerson W. Barbosa.)
Post: #16
|
|||
|
|||
RE: FFT Multiplication (HP-48G/GX/G+)
(02-14-2014 12:34 AM)Raymond Del Tondo Wrote: The most time-consuming parts are now the two FFT and the IFFT calls. I think this has been fixed in my lastest version above, but optimization is still possible in the final loop, I admit. It appears the use of the lists, especially in the final loop, was causing the lack of linearity in the size vs. running times table we would expect from FFT multiplication. The previous hp 50g program would not complete a 128-element size multiplication, at least not under 30 minutes, when I quit waiting. Now it takes only 65 s. Here is a more complete table: Code:
Perhaps 5-digit groups are possible up to that size, considering IFFT is performed with full-accuracy. If not in UserRPL at least in your SysRPL version. This is worth checking. Cheers, Gerson. |
|||
02-14-2014, 10:37 PM
(This post was last modified: 02-15-2014 11:27 PM by Raymond Del Tondo.)
Post: #17
|
|||
|
|||
RE: FFT Multiplication (HP-48G/GX/G+)
Hi,
as promised, here's the beef:-) As written, the code runs in ~22.9 seconds for the two arrays with 64 elements each. Have fun;-) Code: ***************************************************************************** [Edited to correct binary (no debug info) and element number (64 instead of 128)] Cheers Raymond -- Ray |
|||
02-15-2014, 01:22 AM
Post: #18
|
|||
|
|||
RE: FFT Multiplication (HP-48G/GX/G+)
(02-14-2014 10:37 PM)Raymond Del Tondo Wrote: Hi, Thanks for the UserRPL version. Now, the question is: how can those of us not familiar with UserRPL compile this? I tried Debug4x but HEADER.H is missing (assuming this is a valid tool). Cheers, Gerson. |
|||
02-15-2014, 02:15 AM
(This post was last modified: 02-15-2014 11:17 PM by Raymond Del Tondo.)
Post: #19
|
|||
|
|||
RE: FFT Multiplication (HP-48G/GX/G+)
Hi Gerson,
you can comment out the reference to Header.h . I used a personalised version of the standard header file during development, but the "release" SysRPL source is compilable without the header ref. I only use the HP CLI tools in the 32 bit version. Attached is the original source file and the binary for the HP 48G series. The source can be viewed best using an editor which supports mono-spaced fonts, like WordPad, (better) UltraEdit or (best, my choice) TSE Pro 32 . Cheers -- Ray |
|||
02-15-2014, 08:26 PM
(This post was last modified: 02-15-2014 08:31 PM by Gerson W. Barbosa.)
Post: #20
|
|||
|
|||
RE: FFT Multiplication (HP-48G/GX/G+)
(02-14-2014 10:37 PM)Raymond Del Tondo Wrote: Hello, Raymond! On my HP-48GX (and on the emulator) things go like this: [%] [%]' --> ENTER --> [%]". Is this behavior normal? Here is the running times table, times in seconds: Code:
(*) SysRPL times have been measured with a chronometer because I wasn't able to do it programmatically. The HP-48GX times have decreased because now I have 54 K RAM free (previously I had only 11 K). The 49/50g programs are faster because I am using ...DOLIST... instead of HADAMARD: Code:
The previous version using HADAMARD is more compact (299.5 bytes), but gets slower as the arrays grow larger: Code:
Cheers, Gerson. |
|||
« Next Oldest | Next Newest »
|
User(s) browsing this thread: 3 Guest(s)