Threaded Mode | Linear Mode

Gerson W. Barbosa · 02-15-2014, 01:22 AM

(02-14-2014 10:37 PM)Raymond Del Tondo Wrote: Hi,

as promised, here's the beef:-)

As written, the code runs in ~22.9 seconds for the two arrays with 128 elements each.

Have fun;-)

Code:

***************************************************************************** * Name : FFTMULT - FFT Multiplicator * Modultype : Secondary * Dest.Comp. : HP-48 * Language : System RPL * Author : Raymond Del Tondo * Interface : [%] [%]' --> [%]'' * * Notes : It will run on all versions of the HP48 G Series (tested on rev P-R). * * This program has been inspired by a UserRPL program by Gerson W. Barbosa, * which he published here: http://www.hpmuseum.org/forum/thread-649.html * * [ 5476 1407 ] [ 8613 2724 ] FFTMULT --> [ 4716 7491 5498 2668 ] * * [ 1234 5678 9012 ] [ 3456 7890 ] --> [ 426 7640 7023 2002 4680 ] * * The timings below are given for 2 arrays with 128 real numbers each. * See forum thread for actual arrays. * * Ed.Hist. Date Who What * 13.02.2014 RDT Start * 14.02.2014 RDT Speed up everything:-) ***************************************************************************** INCLUDE D:\RPL\TL\HEADER.H ASSEMBLE =ARRYLP_DO EQU #37BCB =PULLEL[S] EQU #3558E =RNDX# EQU #2B551 =XEQ>VEC EQU #1D02C =XEQNOT EQU #1E8D9 RPL DEFINE fft ROMPTR 0C4 000 DEFINE ifft ROMPTR 0C4 002 :: 2DUP ARSIZE SWAP ARSIZE ( *A B SizeB SizeA* ) 2DUP #MAX ( *A B SizeB SizeA MAX[SizeB SizeA]* ) UNROT ( *A B MAX[SizeB SizeA] SizeB SizeA* ) #+ ( *A B MAX[SizeB SizeA] SizeB+A* ) 1LAMBIND ( *Save away SizeB+A* ) UNCOERCE %LN %2 %LN %/ ( *A B LN[MxAB]/LN[2]* ) DUP %FP ( *A B LN[MxAB]/LN[2] FP[LN[MxAB]/LN[2]]* ) XEQNOT XEQNOT ( *A B LN[MxAB]/LN[2] FP[LN[MxAB]/LN[2]]'* ) SWAP %IP ( *A B FP[LN[MxAB]/LN[2]]' IP[LN[MxAB]/LN[2]]* ) %+ ( *A B [FP[LN[MxAB]/LN[2]]']+[IP[LN[MxAB]/LN[2]]]* ) %2 SWAP %^ ( *A B 2^[[FP[LN[MxAB]/LN[2]]']+[IP[LN[MxAB]/LN[2]]]]* ) DUP %+ ( *A B 2_x_2^[[FP[LN[MxAB]/LN[2]]']+[IP[LN[MxAB]/LN[2]]]]* ) %ABSCOERCE * From here: 2xDim := 2_x_2^[[FP[LN[MxAB]/LN[2]]']+[IP[LN[MxAB]/LN[2]]]] DUP ROT SWAP ( *A 2xDim B 2xDim* ) ONE{}N ( *A 2xDim B 2xDim* ) MATREDIM ( *A 2xDim B'* ) UNROT ( *B' A 2xDim* ) ONE{}N ( *B' A 2xDim* ) MATREDIM ( *B' A'* ) fft SWAP fft ( *fftA' fftB'* ) * Up to here the code runs about 11 seconds... ARRYLP_DO (DO) INDEX@ PULLEL[S] ( *fftA' fftB' C%* ) ROT ( *fftB' C% fftA'* ) INDEX@ PULLEL[S] ( *fftB' C% fftA' C%* ) ROT ( *fftB' fftA' C% C%* ) x* ( *fftB' fftA' C%'* ) UNROT ( *C%' fftB' fftA'* ) LOOP DROP ARSIZE * Up to here the code runs about 14 seconds... UNCOERCE XEQ>VEC ifft * Up to here the code runs about 19.8 seconds for new loop ARRYLP_DO (DO) INDEX@ PULLEL[S] DUPTYPECMP? IT C>Re% ( *Complex to real* ) ONE RNDX# ( *Cut off fractional part* ) SWAP LOOP ARSIZE *Here: %n..%1 #n 1GETLAM ( *%n..%1 #n SizeB+A* ) #- ( *%n..%1 #n-[SizeB+A]* ) NDROP ( *%n..%1 Dropped superfluous elements* ) 1GETLAM ( *%n..%1' SizeB+A* ) ONE_DO (DO) OVER ( *ob2 ob1 ob2* ) % 10000 %MOD %+ ( *ob2 ob1+MOD[ob2,10k]* ) SWAP % 10000 %/ %IP ( *ob1+MOD[ob2,10k] IP[ob2/10k]* ) OVER ( *ob1+MOD[ob2,10k] IP[ob2/10k] ob1+MOD[ob2,10k]* ) % 10000 %/ %IP %+ ( *ob1+MOD[ob2,10k] IP[ob2/10k]+IP[[ob1+MOD[ob2,10k]]/10k]* ) SWAP ( *IP[ob2/10k]+IP[[ob1+MOD[ob2,10k]]/10k] ob1+MOD[ob2,10k]* ) % 10000 %MOD ( *IP[ob2/10k]+IP[[ob1+MOD[ob2,10k]]/10k] MOD[ob1+MOD[ob2,10k],10k]* ) ISTOP@ UNROLL LOOP 1GETLAM UNROLL ( *%n'..%1'* ) 1GETABND ( *%n'..%1' SizeB+A* ) UNCOERCE XEQ>VEC ;

Cheers

Raymond

Thanks for the UserRPL version. Now, the question is: how can those of us not familiar with UserRPL compile this? I tried Debug4x but HEADER.H is missing (assuming this is a valid tool).

Cheers,

Gerson.