(42S) matrix permanent

01152024, 12:33 PM
(This post was last modified: 01152024 04:24 PM by Werner.)
Post: #1




(42S) matrix permanent
Thanks to:
usage: with the square matrix in stack register X, XEQ "PER". The permanent is returned in X, and the original, unchanged matrix is in Y. This is important for lowmemory conditions on the 42S. variables used: REGS : v (sum of rows) n : matrix order i : 2^(n1) k : 1..i1 p : permanent 00 { 147Byte Prgm } 01▸LBL "PER" 02 ENTER 03 RSUM 04 X=Y? 05 DET 06 REAL? 07 RTN 08 STO "REGS" 09 DIM? 10 R↓ 11 STO "n" 12 2 13 X<>Y 14 DSE ST X 15 Y^X 16 STO "i" 17 STO "k" 18 R↓ 19 EDIT 20 CLX 21 GTO 14 22▸LBL 10 23 RCL "i" 24 RCL "k" 25 1 26▸LBL 11 27 RCL ST Y 28 4 29 MOD 30 GTO IND ST X 31▸LBL 01 32▸LBL 03 33 STO+ ST X @ (1,3) > (2,2) 34 LASTX 35  36 GTO 04 37▸LBL 00 38 X<> ST L 39 STO÷ ST Z @ k := k/4; 40 SQRT @ j := j + 2; 41 + 42 GTO 11 43▸LBL 02 44 SIGN @ j := j + 1; 45 + 46 X<>Y 47 8 48 MOD @ k MOD 8 is 2 or 6 49 4 @ (2,6) > (2,2) 50  51▸LBL 04 52 RCL "n" 53 1 54 R^ 55 STOIJ 56 R↓ 57 GETM 58 × 59 STO+ "REGS" 60 RCL "p" 61▸LBL 14 62 RCL "n" 63 RCL 00 64 DSE ST Y 65▸LBL 13 66 RCL× IND ST Y 67 DSE ST Y 68 GTO 13 69 RCL ST Z 70 STO "p" 71 DSE "k" 72 GTO 10 73 RCL÷ "i" 74 +/ 75 RCLEL 76 EXITALL 77 X<>Y 78 END Timing/testing the program is easy, once you realize that the permanent of a matrix of order n, filled with all 1's, is n!. timing: order 42S DM42 5 0:18 0.05s 7 1:31 0.22s 9 7:10 1.03s 12 1:06:15 9.39s The timing of order 12 for the 42S is but an estimate, fitting a logarithmic curve through the three data points (x=time/n, y=n), yielding a .99999.. correlation, so should be pretty good. Cheers, Werner [updated with a shorter version... I seem to always find simple ways of shortening programs, moments after I posted them] 41CV†,42S,48GX,49G,DM42,DM41X,17BII,15CE,DM15L,12C,16CE 

01152024, 02:24 PM
Post: #2




RE: (42S) matrix permanent
Hi, Werner
Perhaps you can add some comments to code? Example, why does it use SQRT ? Does it have something to do with j = bit.ffs(k) ? With decimal machine, getting bit.ffs (Find First Set) is expensive. My guess is it would cost way more than these lines (j = f[1]) for next loop. f[1] = 1 f[i] = f[i+1] f[i+1] = i+1 

01152024, 02:45 PM
(This post was last modified: 01162024 04:54 PM by Werner.)
Post: #3




RE: (42S) matrix permanent
Hi, Albert!
The SQRT is a quick way to turn a 4 into a 2 ;) The only part, I think, that requires some explanation is lines 2351. There, I determine j and dj (+/2) from k, as John Keith explained. So, j is the place of the least significant bit in k. Let >>k be k shifted right till the least significant bit is 1, then dj=2 if >>k MOD 4 equals 3, else dj=2 (I *add* dj*Mj, so my signs are swapped). I also technically calculate the permanent of the transposed matrix, which is of course the same  to save a TRANS(M) which on a 42S will duplicate the matrix. I subsequently fetch the jth *column* of M instead of the row. Instead of trying to find the least significant bit of k, successively dividing by 2, I perform k MOD 4 right away, and use GTO IND ST X to branch off:  half of the time, it is either 1 or 3 and we're done (j=1 and dj=2 for 1 and 2 for 3).  if it is 0, divide k by 4, add 2 to j and loop  if it is 2, add 1 to j, and k MOD 8 is now 2 or 6, and dj 2 or 2, respectively I had a version using v,f and d vectors (even one using only the stack), but this latest version runs almost twice as fast, also because I store the vector v in REGS, so that I don't need to switch indexing/editing matrices, and the calculation of product(v) is faster. Cheers, Werner 41CV†,42S,48GX,49G,DM42,DM41X,17BII,15CE,DM15L,12C,16CE 

01152024, 04:06 PM
(This post was last modified: 01152024 04:17 PM by Albert Chan.)
Post: #4




RE: (42S) matrix permanent
(01152024 02:45 PM)Werner Wrote: dj=2 if >>k MOD 4 equals 3, else dj=2 (I *add* dj*Mj, so my signs are swapped). If matrix cells are nonnegative, product of initial v (column sums) is biggest. This is the reason my code does v[j] = v[j]  d[i]*M[i][j] Trivia: If n×n matrix are all ones, its' permanent = n! First term alone give upper limit: n^n / 2^(n1) = 2 * (n/2)^n For comparison, Stirling's approximation: n! ~ √(2*pi*n) * (n/e)^n n=1 > 2*(n/2)^n = 1 = 1! n=2 > 2*(n/2)^n = 2 = 2! n=3 > 2*(n/2)^n = 6.75 > (6 = 3!) n=4 > 2*(n/2)^n = 32 > (24 = 4!) ... Quote:Instead of trying to find the least significant bit of k, successively dividing by 2, I perform k MOD 4 right away ... Nice optimizing trick! Quote:I had a version using v,f and d vectors (even one using only the stack), but this latest version runs almost twice as fast. Interestingly, LuaJIT code is faster (almost twice as fast) using v,f and d vectors. This despite I already use C to code bit.ffs and bit.test. It is perhaps due to LuaJIT have only 1 kind of number = IEEE double. Bit manipulations required conversion of double to integer, then back to double. Guessing which way is faster is hard. We have to code it to know it. 

01152024, 04:22 PM
Post: #5




RE: (42S) matrix permanent
(01152024 04:06 PM)Albert Chan Wrote: If matrix cells are nonnegative, product of initial v (column sums) is biggest.? That's what I do as well, but the sign of the d[i]'s are inversed in my code  and btw you don't have a choice, you have to loop over all prod(sum(+/M[i][j])), and d[j] being 2 means you swap a +M[i][j] for a M[i][j]. Quote:Guessing which way is faster is hard. We have to code it to know it. In the case of the 42S, having to access 4 arrays (M,f,v,d) meant a lot of indexing, as you can have only one matrix indexed/edited at a time  save if you store one in REGS and access it with RCL IND .. Cheers, W. 41CV†,42S,48GX,49G,DM42,DM41X,17BII,15CE,DM15L,12C,16CE 

01162024, 01:00 AM
Post: #6




RE: (42S) matrix permanent
(01152024 12:33 PM)Werner Wrote: timing: XCas> Digits := 5 XCas> N := float([5,7,9]) XCas> T := float([18,91,430]) XCas> correlation(N, ln(T/N)) → 0.99999 XCas> c := linear_regression(N, ln(T/N)) → (0.64641,1.954) ln(t/n) = c[0]*n + c[1] t/n = exp(c[1]) * exp(c[0])^n > time (sec) ≈ 0.14170 * n * 1.9087^n > n=12 time ≈ 3976 sec = 1:06:16 Fit is excellent, but formula does not make sense. We would expect time proportional to number of terms = 2^(n1) Each term, time to update v's and get its product, plus getting j and d. Below does not produce as good correlation, but make more sense. XCas> correlation(N, T/2^N) → 0.99917 XCas> c := linear_regression(N, T/2^N) → (0.069336,0.21908) t/2^n = c[0]*n + c[1] = c[0] * (n + c[1]/c[0]) > time (sec) ≈ 0.069336 * (n + 3.1596) * 2^n > n=12 time ≈ 4305 sec = 1:11:45 

01162024, 08:47 AM
Post: #7




RE: (42S) matrix permanent
(01162024 01:00 AM)Albert Chan Wrote: Fit is excellent, but formula does not make sense. With only 3 sample points, you can't expect miracles. I won't burn through a set of batteries to see which estimate proves to be better, probably yours ;) Werner 41CV†,42S,48GX,49G,DM42,DM41X,17BII,15CE,DM15L,12C,16CE 

01162024, 12:24 PM
Post: #8




RE: (42S) matrix permanent
We can use DM42 timings, without n=12 numbers.
Then, we predict n=12 timing, compare against actual time (9.39 sec) N := [5,7,9] T := [.05, .22, 1.03] Linear regression, N vs ln(T/N) > correlation = 0.99939 > time (sec) = 0.00046355 * n * 1.8393^n > n=12 time ≈ 8.34 sec Linear regression, N vs T/2^N > correlation = 0.98491 > time (sec) = 0.00011230 * (n + 8.7101) * 2^n > n=12 time ≈ 9.53 sec 

01172024, 08:17 PM
Post: #9




RE: (42S) matrix permanent
(01152024 12:33 PM)Werner Wrote: Thanks to: ... and based on a Python/Numpy program posted by GitHub user "lesshaste", who deserves credit as well. Nice to see a version for Free42, I will try your optimization on my RPL program when i get the chance. 

01262024, 10:28 PM
Post: #10




RE: (42S) matrix permanent
(01172024 08:17 PM)John Keith Wrote: ... I will try your optimization on my RPL program when i get the chance. Well, I tried it but it made the program substantially larger and not much faster. RPL doesn't have anything like GTO IND ST X the closest would be either a CASE statement or nested IF..THEN..ELSE structures. The method might work well on the 71B using ON..GOTO or ON..GOSUB though. 

« Next Oldest  Next Newest »

User(s) browsing this thread: