Puzzle - RPL and others - Printable Version +- HP Forums (https://www.hpmuseum.org/forum) +-- Forum: HP Calculators (and very old HP Computers) (/forum-3.html) +--- Forum: General Forum (/forum-4.html) +--- Thread: Puzzle - RPL and others (/thread-16722.html) Pages: 1 2 3 RE: Puzzle - RPL and others - Albert Chan - 04-29-2021 05:16 PM (04-28-2021 08:45 PM)Allen Wrote:  I propose a 63 Byte solution for 41c/42s with a slight rube-goldberg look to the code to minimize size, but maintains divisibility throughout. ... running total starts with -9 for various reasons ... Hi, Allen How does the code work ? I do not see any search, and the loop is basically counting digits. To understand it, I translate to Lua, trying to simplify as much as possible Code: function puzzle()      local a, b, c, total = 19, -9, -27, -9     for x = 3, 9 do         local y = ((a*x + b)*x + c)*x   -- horner([a,b,c], x)         total = total * 10         total = (total - total%x) + y   -- x divides total         print(x, y, total)     end     return total + 18 end lua> puzzle() 3 ﻿ ﻿ ﻿ ﻿ ﻿ 351 ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ 261 4 ﻿ ﻿ ﻿ ﻿ ﻿ 964 ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ 3572 5 ﻿ ﻿ ﻿ ﻿ ﻿ 2015 ﻿ ﻿ ﻿ ﻿ 37735 6 ﻿ ﻿ ﻿ ﻿ ﻿ 3618 ﻿ ﻿ ﻿ ﻿ 380964 7 ﻿ ﻿ ﻿ ﻿ ﻿ 5887 ﻿ ﻿ ﻿ ﻿ 3815525 8 ﻿ ﻿ ﻿ ﻿ ﻿ 8936 ﻿ ﻿ ﻿ ﻿ 38164184 9 ﻿ ﻿ ﻿ ﻿ ﻿ 12879 ﻿ ﻿ 381654711 381654729 Where does the cubic, 19x³ - 9x² - 27x, comes from ? Why total start at -9 ? Why total + 18 at the end ? RE: Puzzle - RPL and others - Allen - 04-29-2021 07:03 PM (04-29-2021 05:16 PM)Albert Chan Wrote:  How does the code work ? In having fun with different ways to calculate the final SSN, I wanted to play around with as simiple an inner loop as possible. Here's a more simplified 46-byte solution that's a smaller, faster, and clearer Code:  00 { 46-Byte Prgm } 01 19 02 STO 01 03 2                    04 STO 00           initialize varibles 05 STO× 01        ensures first 2 digits are multiple of 2 06 ISG 00 07>LBL 01 08 4 09 RCL 00 10 X!=Y? 11 SIGN 12 SQRT 13 RCL× 00     add digit if not equal 4, else add double the digit 14 RCL 01 15 10              multiply by 10 16 × 17 ENTER 18 ENTER 19 RCL 00 20 MOD          calculate difference to closest multiple of current digit 21 -                 22 +               add difference to running total 23 STO 01       PRX after this line if you want to see the number being constructed 24 ISG 00 25 X<>Y          NOP 26 RCL 00 27 × 28 LOG 29 9               terminate with 9 digit number 30 X>Y? 31 GTO 01 32 RCL 01       recall the answer 33 .END. If you take the central part of this code: Code:  15 RCL 01 16 10 17 × 18 ENTER 19 ENTER 20 RCL 00 21 MOD 22 - it assembles (left to right) a number that is divisible by the numbers 3 through 9. Unfortunately the naive way produces a number that is not pan-digital. Python Example here: Code:  m=3 for j in range(2,10):     m*=10     m+=j-m%j     print(m)      32 321 3212 32125 321252 3212524 32125248 321252489 With a few small corrections during the process one can trick the calculator to produce correct results using the naive algorithm: Code:  m=38                             <- start with 38 for j in range(3,10):         <-  start with 3     m*=10     m+=j-m%j     if j==4: m+=j              <-  add 4 if current variable is 4      print(m)      381 3816 38165 381654 3816547 38165472 381654729 (04-29-2021 05:16 PM)Albert Chan Wrote:  Where does the cubic, 19x³ - 9x² - 27x, comes from ? Why total start at -9 ? Why total + 18 at the end ? In my initial larger version, I was looking for a correction factor for the naive approach to make the number pandigital without modifying the actual structure of the program. 381654729 - 321252489 = 60402240 As it turns out, adding the results of the polynomial 19x³ - 9x² - 27x for 2# DUP #2-   BINT63 #> caseSIZEERR   DUPONE #AND #0<> case2DROP   WORDSIZE 1LAMBIND   ERRSET ::     BINT64 dostws BINT1 #>HXS     OVER ONE_DO       DUP ONE{}N 4UNROLL       DUP4UNROLL bitSL     LOOP     DROP' ::       OVER #2* #3+ GETLAM       4PICK bitAND UNROT       3GETLAM FPTR2 ^QMul 1GETLAM       3PICK 3PICKOVER FPTR2 ^#>Z       FPTR2 ^Mod FPTR2 ^Z># #-       DO         INDEX@ #2* #2+ GETLAM         4PICKOVER bitAND         OVER HXS==HXS %0=         ITE_DROP ::           bitNOT 5PICK bitAND           3PICK#1+_ 3PICK INDEX@           FPTR2 ^#>Z FPTR2 ^QAdd           OVER 1GETLAM #<>case           2GETEVAL           ROTDROPSWAP           #2* #2* #3- UNROLL         ;       OVER +LOOP       4DROP     ;     SWAP' NULLLAM OVER #2* #1+     NDUPN DOBIND     1GETLAM ONE_DO       1GETLAM INDEX@       BEGIN         SWAPOVER #/ DROP       #0=UNTIL       DROP #2* #3+ DUP GETLAM       INNERCOMP INDEX@ ROTOVER       #2* #2+ GETLAM bitOR       ROT#1+ {}N SWAP PUTLAM     LOOP     1GETLAM ONE_DO       INDEX@ #2* #3+ GETLAM       DUPTYPEHSTR?       ITE_DROP ::         INNERCOMP ONE_DO           DUPROT #2* #3+ PUTLAM         LOOP         DROP       ;     LOOP     BINT0 #>HXS bitNOT BINT1 Z0_     2GETEVAL     ABND   ; ERRTRAP ::     1GETLAM dostws ERRJMP   ;   1GETABND dostws ; (405.5 bytes, #C476h) The goal was to make the check for whether a digit is in the right bucket for the currently processed place as quick as I could: another bitset, which gets ANDed into the bitset of not-yet-taken digits before the digit-testing loop in the recursive subroutine starts. (My previous programs used a bitset of already-taken digits, but that's just an inversion away.) The preprocessing that goes into constructing the bucket bitsets is pretty quick too, since it doesn't use many loops or particularly slow commands, and especially not recursively nested loops. This new part is just a loop over all valid digits taking the GCD between it and the base, using an indefinite loop for a custom BINT version of the Euclidean Algorithm (since I didn't want slow back-and-forth conversion to a slower type which would likely also use that algoithm in ROM). With this GCD a list is recalled, the digit appended to it, a bitset at the end of the list updated by ORing the digit's single-bit bitset into it, and the resulting list stored back. A separate loop over the GCDs (actually over all digits, but everything that doesn't have a list gets ignored) and distributes the finished bitset to the digits in the list. This re-uses the same local variables for lists and extracted bitsets, but I made sure that's safe. If I comment out the part that sets up the three parameters for the outermost recursive subroutine call along with the call itself (i.e. leaving only this new setup part and the small setup which existed before), the time spent is around 0.16_s for bases 22 and 24 (the highest bases I tried so far). Compare that to how long the recursive part took: 150_s for base 24, which benefits greatly from all that bucket-sorting, and a staggering 1304_s (over 21 minutes!) for base 22, which does not benefit from it at all beyond the simple odd/even and ==base/2 checks. Good ol' base 10 takes just under 2.5 seconds. Other changes include switching the types employed for the under-construction number and bitsets from reals and BINTs to ZINTs and HXSs, respectively. ZINTs slow things down, but reals broke down partially in base 13 and fully in base 14 and beyond (because they only hold 12 decimal digits), and extended reals push this limit up by only 3. So infinite-precision integers are the way to go; I don't need a fractional part anyway. Similarly, BINTs have only 20 bits, so they only work as bitsets up to base 21; HXSs are theoretically infinite (though the ROM math routines only support up to 64 bits), but also slower. Still, slow results is better than wrong results - base 65 is the upper limit now. Without these type changes base 10 was in the neighborhood of 1 second. Oh, and I added an early exit in basically constant time for odd bases. No point checking those anymore, just return no results immediately. --- While climbing the ladder up to base 24, I also noticed that quite a few of the bigger even bases don't have solutions either. Assuming my code is correct, these are the solutions by base: - 2: $$1_2 = 1_{10}$$ - 4: $$123_2 = 27_{10}, 321_2 = 57_{10}$$ - 6: $$14325_6 = 2285_{10}, 54321_6 = 7465_{10}$$ - 8: $$3254167_8 = 874615_{10}, 5234761_8 = 1391089_{10}, 5674321_8 = 1538257_{10}$$ - 10: $$381654729_{10}$$, the original puzzle target - 12: nothing - 14: $$9C3A5476B812D_{14} = 559922224824157_{10}$$ - 16 to 24: nothing I would have liked to dump some Babylonian numerals on you eventually (base 60), but I don't think that's going to happen if I look at the TEVAL reports for 22 and 24. The preprocessing isn't too bad (without the recursive subroutine I measured it to take 0.44 seconds for base 60), but 60 levels of recursion with a loop in each surely is just asking too much from a tortured little pocket calculator. We'd need more shortcuts to make that even remotely feasible. RE: Puzzle - RPL and others - Albert Chan - 05-04-2021 03:29 AM (05-03-2021 03:43 PM)3298 Wrote:  ... a staggering 1304_s (over 21 minutes!) for base 22, which does not benefit from it at all beyond the simple odd/even and ==base/2 checks. Good ol' base 10 takes just under 2.5 seconds. There is also a mod-4 bucket, for even base n: n = 4k: ﻿ ﻿ ﻿ ﻿ d4 ≡ d8 ≡ ... ≡ d4k ≡ 0 (mod 4) ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ → d2 ≡ d6 ≡ ... ≡ d4k-2 ≡ 2 (mod 4) n = 4k+2: d4 ≡ d8 ≡ ... ≡ d4k ≡ 2 (mod 4) ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ → d2 ≡ d6 ≡ ... ≡ d4k+2 ≡ 0 (mod 4) Combined, we have the invariant: (n + 2i + d2i) ≡ 0 (mod 4) Code: def recurse3(lst, n, k=1, x=0):     if k==n: print x; return     x, d0, step = n*x, lst[k], k+k     d1 = k-x%k  # first valid (mod k)     if k&1:     # odd k         if not (d1&1): d1 += k  # d1 also odd     elif k==2:         d1 = 4 if (n&3) else 2     else:       # even k >= 4         bad = (n+k-d1) & 3         if k&3: d1 += bad and k # d1 (mod4) = n+k         elif bad: return        # can't fix by +k         else: step = k          # step (mod4) = 0     for d in xrange(d1, n, step):         try: i = lst.index(d,k)         except ValueError: continue         lst[i] = d0         recurse3(lst, n, k+1, x+d)         lst[i] = d      # restore >>> recurse3(range(10), 10) 381654729 With this, I confirmed there is no solution for 16 ≤ n ≤ 40 RE: Puzzle - RPL and others - 3298 - 05-04-2021 06:48 AM (05-04-2021 03:29 AM)Albert Chan Wrote:  There is also a mod-4 bucket, for even base n: n = 4k: ﻿ ﻿ ﻿ ﻿ d4 ≡ d8 ≡ ... ≡ d4k ≡ 0 (mod 4) ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ → d2 ≡ d6 ≡ ... ≡ d4k-2 ≡ 2 (mod 4) This one I have covered with my GCD partitioning, because the GCD between 4k and { 4, 8, ... 4k-4 } is obviously always 4 or some multiple of it. However, this: (05-04-2021 03:29 AM)Albert Chan Wrote:  n = 4k+2: d4 ≡ d8 ≡ ... ≡ d4k ≡ 2 (mod 4) ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ → d2 ≡ d6 ≡ ... ≡ d4k+2 ≡ 0 (mod 4) is new to me. Looking back, you already mentioned it back in your base-10-by-hand post, and I think I finally understand it now: divisibility by 4 in 4k+2 bases takes the last two digits, and the constraint that the first of these has to be odd (thanks to divisibility by 2 partitioning) leads to this result. This is an interesting improvement for 4k+2 bases, but I don't think it's generalizable further in a worthwhile manner, so 4k bases get no additional help. (05-04-2021 03:29 AM)Albert Chan Wrote:  With this, I confirmed there is no solution for 16 ≤ n ≤ 40 Conjecture: there are no solutions for N>14. No clue how to go about proving it though. RE: Puzzle - RPL and others - Albert Chan - 05-05-2021 06:29 PM I was unsatisfied with recurse3(). Although it added mod4 buckets, the search for duplicates is O(n), instead of previously O(1). x stored the "running total" of the solution, we could use lst for something else. This version use lst to track what were taken. (by marking the cell 0) Lookup for duplicates now back to O(1). New code is actually shorter than before. So, I added bucket, n/2. Implementation is trivial. Just set center of list, range(n), to 0. Code: def puzzle4(n):     lst = range(n)  # assume even n     lst[n//2] = 0   # skip center     return recurse4(lst, n) Then, recurse4() just skip pass center, if found. Code: def recurse4(lst, n, k=1, x=0):     if k==n: print x; return     if k+k==n: x=n*x+k; k+=1    # skip center     x, step = n*x, k+k     d1 = k-x%k  # first valid (mod k)     if k&1:     # odd k         if not (d1&1): d1 += k  # d1 also odd     elif k==2:         d1 = 4 if (n&3) else 2     else:       # even k >= 4         bad = (n+k-d1) & 3         if k&3: d1 += bad and k # d1 (mod4) = n+k         elif bad: return        # can't fix by +k         else: step = k          # step (mod4) = 0     for d in xrange(d1, n, step):         if not lst[d]: continue         lst[d] = 0              # lst[d] mark taken         recurse4(lst, n, k+1, x+d)         lst[d] = d              # restore >>> puzzle4(10) 381654729 Here is (lst, x) when first digit is 3: [0, 1, 2, 0, 4, 0, 6, 7, 8, 9] 3 [0, 1, 2, 0, 0, 0, 6, 7, 8, 9] 34 [0, 1, 2, 0, 4, 0, 6, 7, 0, 9] 38 [0, 0, 2, 0, 4, 0, 6, 7, 0, 9] 381 [0, 0, 0, 0, 4, 0, 6, 7, 0, 9] 3812 [0, 0, 2, 0, 4, 0, 0, 7, 0, 9] 3816 [0, 0, 2, 0, 0, 0, 0, 7, 0, 9] 381654 [0, 0, 2, 0, 0, 0, 0, 0, 0, 9] 3816547 [0, 0, 0, 0, 0, 0, 0, 0, 0, 9] 38165472 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0] 381654729 With this, I confirmed there is no solution for 16 ≤ n ≤ 44 Interestingly, it take less time to run n=42, than n=40 n﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ time (sec), on my Toshiba A665 laptop 32﻿ ﻿ ﻿ ﻿ ﻿ ﻿ 2 34﻿ ﻿ ﻿ ﻿ ﻿ ﻿ 8 36﻿ ﻿ ﻿ ﻿ ﻿ ﻿ 11 38﻿ ﻿ ﻿ ﻿ ﻿ ﻿ 55 40﻿ ﻿ ﻿ ﻿ ﻿ ﻿ 114 42﻿ ﻿ ﻿ ﻿ ﻿ ﻿ 110 44﻿ ﻿ ﻿ ﻿ ﻿ ﻿ 843 ≈ 14 minutes RE: Puzzle - RPL and others - 3298 - 05-06-2021 04:24 PM Patching my last SysRPL program for the division-by-4 stuff discussed above: change these lines ... Code:       1GETLAM INDEX@       BEGIN ... to ... Code:       1GETLAM DUPTWO #AND       #0=SKIP #2*       INDEX@       BEGIN (full patched program: 415.5 bytes, #016Ah) 22 takes about 6.12_s now. What a speedup. 24 is obviously unchanged at 150_s (dunno what's taking so long there, perhaps it has to dig deep into the recursion before it can reject everything), 26 takes 10 seconds. 28 took ages with 3967_s. 30 is back down to 5.7_s, and 32 bounces up again (still running). (05-05-2021 06:29 PM)Albert Chan Wrote:  With this, I confirmed there is no solution for 16 ≤ n ≤ 44 Make that 16 ≤ n ≤ 56. I brought out the big guns too, rewrote my SysRPL program in C (code below) and ran it on my laptop, an XPS 13 9343 equipped with i5-5300U. Not exactly a speed demon, and the code is probably not perfect either, but it seems to do the job better than your script language. (Or is my full GCD partitioning worth that much? I doubt it, because there are still a few prime*2 numbers in there which it literally can't handle better than your simpler partitioning.) Up to 56 it took a handful of minutes per base at most. I aborted 58 after about two and a half hours, 60 went through in less than five seconds (highly composite number to the rescue! A perfect fit for GCD partitioning) with no results either. Haven't tried 62. Since this program needs to handle quite big integers (> 350 bits if my math is right), I implemented my own unsigned near-infinite-length integers. The routines to operate on them are just the bare minimum for this program (addition, multiplication, modulo, and division all between an infinite-length integer and a short one, creating 0, checking for 0, disposal). Bitsets are still only 64 bits (unsigned long long), which limits the base to < 66, but I expect those wouldn't take me much to move to infinite length too: only needs AND, OR, leftshift by short integer, NOT on specified width. Alternatively I could invert part of the logic (switching not_taken back to taken) and avoid the NOT operation where zero-extension is harmful. It would be replaced by a combined ANDNOT, i.e. an AND operation which inverts one parameter; the other one would still put an upper bound on the length of the result when all 1-bits have to be stored. Type sizes for the infinite-length integers are chosen such that 14 triggers all codepaths already (well, except errors of course); since the program returns the correct result on this case, it should have a good chance at being bug-free. Maybe performance could be improved by increasing these sizes and relying on the compiler's optimized code to handle operations on integers larger than the processor's wordsize. Code: #include  #include  #include  #include  #include  #include  typedef unsigned int singledata_t; typedef unsigned long long doubledata_t; static_assert (sizeof(singledata_t) * 2 == sizeof(doubledata_t)); typedef struct {     unsigned int len;     singledata_t data[]; } nearinflen_uint_t; #define NEARINFLEN_UINT_MALLOC(len) ((nearinflen_uint_t *) \     malloc(sizeof(nearinflen_uint_t) + sizeof(singledata_t) * (len))) #define NEARINFLEN_UINT_SHIFT (sizeof(singledata_t) * 8) static inline nearinflen_uint_t *nearinflen_uint_zero() {     nearinflen_uint_t *n = NEARINFLEN_UINT_MALLOC(1);     assert(n != NULL);     n->len = 1;     n->data[0] = 0;     return n; } static inline bool nearinflen_uint_iszero(nearinflen_uint_t *n) {     int i;     for (; ilen; ++i)         if (n->data[i] != 0) return false;     return true; } static inline void nearinflen_uint_release(nearinflen_uint_t *n) {     free(n); } static nearinflen_uint_t *nearinflen_uint_add(nearinflen_uint_t *n, singledata_t m) {     nearinflen_uint_t *res = NEARINFLEN_UINT_MALLOC(n->len + 1);     assert(res != NULL);     unsigned int i = 0;     doubledata_t carry = m;     for (; i < n->len && carry != 0; ++i) {         carry += n->data[i];         res->data[i] = (singledata_t) carry;         carry >>= NEARINFLEN_UINT_SHIFT;     }     if (carry != 0) {         res->data[n->len] = (singledata_t) carry;         res->len = n->len + 1;     } else {         res->len = n->len;         memcpy(res->data + i, n->data + i, sizeof(singledata_t) * (n->len - i));     }     return res; } static nearinflen_uint_t *nearinflen_uint_mul(nearinflen_uint_t *n, singledata_t m) {     nearinflen_uint_t *res = NEARINFLEN_UINT_MALLOC(n->len + 1);     assert(res != NULL);     unsigned int i = 0;     doubledata_t carry = 0;     for (; i < n->len; ++i) {         carry += ((doubledata_t) n->data[i]) * m;         res->data[i] = (singledata_t) carry;         carry >>= NEARINFLEN_UINT_SHIFT;     }     if (carry != 0) {         res->data[n->len] = (singledata_t) carry;         res->len = n->len + 1;     } else res->len = n->len;     return res; } static singledata_t nearinflen_uint_div2(nearinflen_uint_t *n, singledata_t m,         nearinflen_uint_t **res) {     if (res != NULL) {         *res = NEARINFLEN_UINT_MALLOC(n->len);         assert(*res != NULL);         memset((*res)->data, 0, sizeof(singledata_t) * n->len);     }     unsigned int i = n->len - 1;     doubledata_t rem = n->data[i];     if (m > rem) {         if (i == 0) {             if (res != NULL) (*res)->len = 1;             return n->data[0];         }         --i;         rem = (rem << NEARINFLEN_UINT_SHIFT) | n->data[i];     }     if (res != NULL) (*res)->len = i + 1;     doubledata_t m_shifted = m, bit = 1;     for (; rem >= m_shifted; bit <<= 1, m_shifted <<= 1);     while (bit != 1 || i != 0) {         if (bit == 1) {             bit <<= NEARINFLEN_UINT_SHIFT;             m_shifted <<= NEARINFLEN_UINT_SHIFT;             --i;             rem = (rem << NEARINFLEN_UINT_SHIFT) | n->data[i];         }         bit >>= 1;         m_shifted >>= 1;         if (rem >= m_shifted) {             rem -= m_shifted;             if (res != NULL) (*res)->data[i] |= bit;         }     }     return (singledata_t) rem; } typedef unsigned long long bitset_t; typedef struct output_digit_t output_digit_t; struct output_digit_t {     singledata_t digit;     output_digit_t *next; }; singledata_t base; bitset_t *valid_digits; static void try_digits(nearinflen_uint_t *n, singledata_t i, bitset_t not_taken) {     bitset_t valid = not_taken & valid_digits[i];     singledata_t digit = i - nearinflen_uint_div2(n, i, NULL);     for (; digit < base; digit += i) {         bitset_t digit_bitset = 1 << (digit - 1);         if ((valid & digit_bitset) == 0) continue;         nearinflen_uint_t *tmp = nearinflen_uint_add(n, digit);         if (i + 1 != base) {             nearinflen_uint_t *tmp2 = nearinflen_uint_mul(tmp, base);             nearinflen_uint_release(tmp);             try_digits(tmp2, i + 1, not_taken & ~digit_bitset);             continue;         }         /* this destroys digit and n, but we cannot have            two results that only differ in the last digit anyway,            so print and break out */         nearinflen_uint_release(n);         digit = nearinflen_uint_div2(tmp, base, &n);         bool last;         output_digit_t *out = NULL, *prev;         do {             last = nearinflen_uint_iszero(n);             prev = out;             out = malloc(sizeof(output_digit_t));             assert(out != NULL);             out->next = prev;             out->digit = digit;             nearinflen_uint_release(tmp);             tmp = n;             digit = nearinflen_uint_div2(tmp, base, &n);         } while (!last);         nearinflen_uint_release(tmp);         do {             last = out->next == NULL;             printf("%u%c", out->digit, last ? '\n' : ';');             prev = out;             out = out->next;             free(prev);         } while (!last);         break;     }     nearinflen_uint_release(n); } static inline singledata_t gcd(singledata_t big, singledata_t small) {     do {         singledata_t tmp = big % small;         big = small;         small = tmp;     } while (small != 0);     return big; } int main(int argc, char **argv) {     if (argc < 2) {         printf("Usage: %s \n", argv[0]);         return 1;     }     base = (singledata_t) atoi(argv[1]);     if (base < 2 || base > 65) {         printf("Invalid input, must be an integer between 2 and 65\n");         return 1;     }     if ((base & 1) != 0) return 0;     valid_digits = calloc(base, sizeof(bitset_t));     assert(valid_digits != NULL);     singledata_t *gcd_by_pos = calloc(base, sizeof(singledata_t));     assert(gcd_by_pos != NULL);     singledata_t i = 1;     bitset_t bit = 1;     for (; i < base; ++i, bit <<= 1) {         singledata_t tmp = gcd(i, base & 2 != 0 ? base * 2 : base);         gcd_by_pos[i] = tmp;         valid_digits[tmp] |= bit;     }     for (i = 1; i < base; ++i) valid_digits[i] = valid_digits[gcd_by_pos[i]];     free(gcd_by_pos);     nearinflen_uint_t *zero = nearinflen_uint_zero();     try_digits(zero, 1, ~((bitset_t) 0));     free(valid_digits);     return 0; } RE: Puzzle - RPL and others - Albert Chan - 05-06-2021 09:09 PM (05-05-2021 06:29 PM)Albert Chan Wrote:  With this, I confirmed there is no solution for 16 ≤ n ≤ 44 Interestingly, it take less time to run n=42, than n=40 I found out the reason n=42 take less time than n=40. Think non-buckets. Example, first digit does not belong to any buckets. >>> n = 10 # = 2 * 5 >>> sorted(set(range(1,n)) - set(range(2,n,2)) - set(range(5,n,5))) [1, 3, 7, 9] >>> n = 40 # = 2**3 * 5 >>> sorted(set(range(1,n)) - set(range(2,n,2)) - set(range(5,n,5))) [1, 3, 7, 9, 11, 13, 17, 19, 21, 23, 27, 29, 31, 33, 37, 39] >>> n = 42 # = 2 * 3 * 7 >>> sorted(set(range(1,n)) - set(range(2,n,2)) - set(range(3,n,3)) - set(range(7,n,7))) [1, 5, 11, 13, 17, 19, 23, 25, 29, 31, 37, 41] This is the set of digits for d1, and n=42 has less elements than n=40. (looping anything else is just a waste of time) And, this set applied not just d1, but any dk that is coprime to n. Conversely, if k is not coprime to n, dk must not be above sets of values. Combined, we have invariant: coprime(n,k) = coprime(n,dk) Code: from euclid import gcd def puzzle5(n):     lst = range(n)  # assume even n     lst[n//2] = 0   # skip center     return recurse5(lst + [gcd(n,d)==1 for d in lst], n) def recurse5(lst, n, k=1, x=0):     if k==n: print x; return     if k+k==n: x=n*x+k; k+=1    # skip center     x, step = n*x, k+k     d1 = k-x%k  # first valid (mod k)     if k&1:     # odd k         if not (d1&1): d1 += k  # d1 also odd     elif k==2:         d1 = 4 if (n&3) else 2     else:       # even k >= 4         bad = (n+k-d1) & 3         if k&3: d1 += bad and k # d1 (mod4) = n+k         elif bad: return        # can't fix by +k         else: step = k          # step (mod4) = 0     for d in xrange(d1, n, step):         if not lst[d]: continue         if lst[n+d] != lst[n+k]: continue         lst[d] = 0              # mark taken         recurse5(lst, n, k+1, x+d)         lst[d] = d              # restore I doubled up lst for 2 purpose. (lst combined to reduce stress of arguments passing) First half is to check duplicates, like before. Second half holds coprime(n,d) table: lst[n+d] ≡ coprime(n,d) Coprime test benefited if base n has many different factors. Code:        time (sec), on my Toshiba A665 laptop n      puzzle4(n)  puzzle5(n) 32﻿     2           2 34﻿     8           9 36﻿     11          3 38﻿     55          60 40﻿     114         46 42﻿     110         31 44﻿     843         590 RE: Puzzle - RPL and others - Albert Chan - 05-07-2021 02:56 AM Previous post Python code, translated to Lua. Since Lua does not have infinite precision integer built-in, I just loop the digits for MOD. As expected, speed improved quite a bit, about 5X. (LuaJIT 1.1.8) I have confirmed n=60 has no solution ... in an hour. Code: do local gcd, recurse5 = require 'factor'.gcd function puzzle5(n)     local lst = {}     for i=1,n do lst[i] = i end     for i=1,n do lst[n+i] = (gcd(n,i) == 1) end     lst[floor(n/2)] = false     -- skip center     return recurse5(lst, n, 1, {}) end function recurse5(lst, n, k, X)        if k==n then return print(table.concat(X,',')) end     if k+k==n then X[k]=k; k=k+1 end -- skip center     local d, step = 0, k+k     for i=1,k-1 do d = (d*n + X[i]) % k end     d = k - d*n%k               -- first valid (mod k)     if k%2~=0 then              -- k is odd -> d odd too         if d%2==0 then d = d+k end     elseif k==2 then            -- mod4 restricted         d = (n%4~=0) and 4 or 2     else         local bad = ((n+k-d)%4 ~= 0)         if k%4 ~= 0 then if bad then d=d+k end         elseif bad  then return -- can't fix by +k         else step = k           -- step = 0 (mod4)         end            end     while d < n do         if lst[d] and lst[n+d] == lst[n+k] then             X[k] = d                         lst[d] = false      -- mark taken                  recurse5(lst, n, k+1, X)             lst[d] = d          -- restore         end         d = d + step     end end end lua> puzzle5(8) 3,2,5,4,1,6,7 5,2,3,4,7,6,1 5,6,7,4,3,2,1 lua> puzzle5(10) 3,8,1,6,5,4,7,2,9 lua> puzzle5(14) 9,12,3,10,5,4,7,6,11,8,1,2,13 RE: Puzzle - RPL and others - Albert Chan - 05-07-2021 10:35 AM (05-06-2021 09:09 PM)Albert Chan Wrote:  Combined, we have invariant: coprime(n,k) = coprime(n,dk) I was being stupid ... test for coprime is same as test for gcd = 1 And, for gcd ≠ 1, it is the buckets that we are after. Since test for bucket n/2 is same as matching gcd = n/2, we can remove it. Hopefully, this is my final version, by removing stuff from recurse5() / puzzle5(). Code: do local gcd, recurse = require 'factor'.gcd function puzzle(n)     if n%2 ~= 0 then return end -- n must be even     local lst = {}     for i=1,n do lst[i] = i end     for i=1,n do lst[n+i] = gcd(n,i) end     return recurse(lst, n, 1, {}) end function recurse(lst, n, k, X)     if k==n then return print('{'..table.concat(X,',')..'}') end     local d, step, s = 0, k+k, k%4     for i=1, k-6, 6 do          -- assume n <= 190         d = ((((((d*n + X[i])*n + X[i+1])*n + X[i+2])*n                     + X[i+3])*n + X[i+4])*n + X[i+5]) % k     end     for i = k-(k-1)%6, k-1 do d = d*n + X[i] end             d = k - d*n%k               -- first valid (mod k)     if s == 0 then              -- k = step = 0 (mod 4)         step = k     elseif s ~= 2 then          -- k = d = 1 (mod 2)         if d%2==0 then d=d+k end     elseif (n+d)%4 == 0 then    -- n+d+k = 0 (mod 4)         d = d+k     end          s = lst[n+k]                -- = gcd(n,k)     while d < n do         if lst[d] and lst[n+d] == s then             X[k] = d             lst[d] = false      -- mark taken             recurse(lst, n, k+1, X)             lst[d] = d          -- restore         end         d = d + step     end end end (05-07-2021 02:56 AM)Albert Chan Wrote:  I have confirmed n=60 has no solution ... in an hour. Updated version finished in 4½ minutes. Update1: add a simple test to quit if n is odd. Update2: unroll loops to reduce expensive mod's, this finish n=60 case in 3 minutes. The downside is n^7 ≤ 2^53 ⇒ n ≤ 190, which should be plenty enough. RE: Puzzle - RPL and others - 3298 - 05-07-2021 04:17 PM (05-07-2021 10:35 AM)Albert Chan Wrote:  Since test for bucket n/2 is same as matching gcd = n/2, we can remove it. Hopefully, this is my final version, by removing stuff from recurse5() / puzzle5(). Nah, there's more to remove. Odd/even is covered by whether the GCD is a multiple of 2 or not. So the odd/even test can go. As hinted at in my answer to your private message, as well as in my previous code, the mod4 test on non-mod4 bases can be worked into the GCD partitioning as well. And as I found out after writing the code, it can also be generalized to higher powers of 2, more specifically to the smallest power of 2 that doesn't divide the base anymore. I'm doubling the base for the purpose of GCD calls - that covers this mod4 case and similar cases with higher powers of 2, but it needs an adjustment after creating the buckets: swap the mod4 one (or whatever power of 2) with the mod2 one (or next-lower power of 2 in the general case), that accounts for the offset of half the mod between qualifying digits and the mod. Exception: if the base is a power of 2 itself, all GCDs are powers of 2 as well, and the additional bucket is outside of our range of digits, so no swapping is needed. The next-lower power of 2 would also be the base itself, which isn't a valid bucket either because the digit range goes to just 1 below that, so depending on how your languages handles out-of-bounds accesses, you could just swap anyway. To illustrate: - On base 10: - - creating buckets with GCD(20, i): 1->{1, 3, 7, 9}, 2->{2, 6}, 4->{4, 8}, 5->{5} - - highest power of 2 dividing the base is 2, so this extra bucket is mod4 -> the buckets to swap are 2 and 4 - - result: digits allowed in positions {1, 3, 7, 9} are {1, 3, 7, 9}, digits allowed in positions {2, 6} are {4, 8}, digits allowed in positions {4, 8} are {2, 6}, and position 5 has to be 5, of course. - On base 12: - - creating buckets with GCD(24, i): 1->{1, 5, 7, 11}, 2->{2, 6, 10}, 4->{4}, 6->{6}, 8->{8} - - highest power of 2 dividing the base is 4, so this extra bucket is mod8 -> the buckets to swap are 4 and 8 - - result: digits allowed in positions {1, 5, 7, 11} are {1, 5, 7, 11}, digits allowed in positions {2, 6, 10} are {2, 6, 10}, position 4 has to be 8, position 6 has to be 6, and position 8 has to be 4. - On base 8: - - creating buckets with GCD(16, i): 1->{1, 3, 5, 7}, 2->{2, 6}, 4->{4} (note: these are the same buckets as with GCD(8, i), doubling the base has no effect on powers of 2) - - highest power of 2 dividing the base is 8, so this extra bucket would be mod16 -> don't swap to avoid index-out-of-bounds - - result: digits allowed in positions {1, 3, 5, 7} are {1, 3, 5, 7}, digits allowed in positions {2, 6} are {2, 6}, and position 4 has to be 4. --- The mentioned private message was an inquiry regarding what turned out to be bugs in my programs. The first is a simple uninitialized variable in nearinflen_uint_iszero, where the declaration of i needs to be turned into a definition to 0. (Why didn't GCC warn me about use of an uninitialized variable?) The function is only used in printing anyway, so the calculations wouldn't be affected by the bug. Either way, it's fixed in the code listing below. The more serious bug is an incorrectly patched-in mod4 detection, which resulted in no solutions found on bases 6, 10, and 14. (Silly me for not testing the patch sufficiently!) This bug was actually present not only in the C version but also in the patched SysRPL one, since they both use the same algorithm. The explanation above is for the corrected version; they previously only doubled up the base, and did so only if it wasn't divisible by 4. The generalization removes this divisibility-by-4 check, and the bugfix swaps the buckets. Corrected code for both languages follows. Code: #include  #include  #include  #include  #include  #include  typedef unsigned int singledata_t; typedef unsigned long long doubledata_t; static_assert (sizeof(singledata_t) * 2 == sizeof(doubledata_t)); typedef struct {     unsigned int len;     singledata_t data[]; } nearinflen_uint_t; #define NEARINFLEN_UINT_MALLOC(len) ((nearinflen_uint_t *) \     malloc(sizeof(nearinflen_uint_t) + sizeof(singledata_t) * (len))) #define NEARINFLEN_UINT_SHIFT (sizeof(singledata_t) * 8) static inline nearinflen_uint_t *nearinflen_uint_zero() {     nearinflen_uint_t *n = NEARINFLEN_UINT_MALLOC(1);     assert(n != NULL);     n->len = 1;     n->data[0] = 0;     return n; } static inline bool nearinflen_uint_iszero(nearinflen_uint_t *n) {     int i = 0;     for (; i < n->len; ++i)         if (n->data[i] != 0) return false;     return true; } static inline void nearinflen_uint_release(nearinflen_uint_t *n) {     free(n); } static nearinflen_uint_t *nearinflen_uint_add(nearinflen_uint_t *n, singledata_t m) {     nearinflen_uint_t *res = NEARINFLEN_UINT_MALLOC(n->len + 1);     assert(res != NULL);     unsigned int i = 0;     doubledata_t carry = m;     for (; i < n->len && carry != 0; ++i) {         carry += n->data[i];         res->data[i] = (singledata_t) carry;         carry >>= NEARINFLEN_UINT_SHIFT;     }     if (carry != 0) {         res->data[n->len] = (singledata_t) carry;         res->len = n->len + 1;     } else {         res->len = n->len;         memcpy(res->data + i, n->data + i, sizeof(singledata_t) * (n->len - i));     }     return res; } static nearinflen_uint_t *nearinflen_uint_mul(nearinflen_uint_t *n, singledata_t m) {     nearinflen_uint_t *res = NEARINFLEN_UINT_MALLOC(n->len + 1);     assert(res != NULL);     unsigned int i = 0;     doubledata_t carry = 0;     for (; i < n->len; ++i) {         carry += ((doubledata_t) n->data[i]) * m;         res->data[i] = (singledata_t) carry;         carry >>= NEARINFLEN_UINT_SHIFT;     }     if (carry != 0) {         res->data[n->len] = (singledata_t) carry;         res->len = n->len + 1;     } else res->len = n->len;     return res; } static singledata_t nearinflen_uint_div2(nearinflen_uint_t *n, singledata_t m,         nearinflen_uint_t **res) {     if (res != NULL) {         *res = NEARINFLEN_UINT_MALLOC(n->len);         assert(*res != NULL);         memset((*res)->data, 0, sizeof(singledata_t) * n->len);     }     unsigned int i = n->len - 1;     doubledata_t rem = n->data[i];     if (m > rem) {         if (i == 0) {             if (res != NULL) (*res)->len = 1;             return n->data[0];         }         --i;         rem = (rem << NEARINFLEN_UINT_SHIFT) | n->data[i];     }     if (res != NULL) (*res)->len = i + 1;     doubledata_t m_shifted = m, bit = 1;     for (; rem >= m_shifted; bit <<= 1, m_shifted <<= 1);     while (bit != 1 || i != 0) {         if (bit == 1) {             bit <<= NEARINFLEN_UINT_SHIFT;             m_shifted <<= NEARINFLEN_UINT_SHIFT;             --i;             rem = (rem << NEARINFLEN_UINT_SHIFT) | n->data[i];         }         bit >>= 1;         m_shifted >>= 1;         if (rem >= m_shifted) {             rem -= m_shifted;             if (res != NULL) (*res)->data[i] |= bit;         }     }     return (singledata_t) rem; } typedef unsigned long long bitset_t; typedef struct output_digit_t output_digit_t; struct output_digit_t {     singledata_t digit;     output_digit_t *next; }; singledata_t base; bitset_t *valid_digits; static void try_digits(nearinflen_uint_t *n, singledata_t i, bitset_t not_taken) {     bitset_t valid = not_taken & valid_digits[i];     singledata_t digit = i - nearinflen_uint_div2(n, i, NULL);     for (; digit < base; digit += i) {         bitset_t digit_bitset = 1 << (digit - 1);         if ((valid & digit_bitset) == 0) continue;         nearinflen_uint_t *tmp = nearinflen_uint_add(n, digit);         if (i + 1 != base) {             nearinflen_uint_t *tmp2 = nearinflen_uint_mul(tmp, base);             nearinflen_uint_release(tmp);             try_digits(tmp2, i + 1, not_taken & ~digit_bitset);             continue;         }         /* this destroys digit and n, but we cannot have            two results that only differ in the last digit anyway,            so print and break out */         nearinflen_uint_release(n);         digit = nearinflen_uint_div2(tmp, base, &n);         bool last;         output_digit_t *out = NULL, *prev;         do {             last = nearinflen_uint_iszero(n);             prev = out;             out = malloc(sizeof(output_digit_t));             assert(out != NULL);             out->next = prev;             out->digit = digit;             nearinflen_uint_release(tmp);             tmp = n;             digit = nearinflen_uint_div2(tmp, base, &n);         } while (!last);         nearinflen_uint_release(tmp);         do {             last = out->next == NULL;             printf("%u%c", out->digit, last ? '\n' : ';');             prev = out;             out = out->next;             free(prev);         } while (!last);         break;     }     nearinflen_uint_release(n); } static inline singledata_t gcd(singledata_t big, singledata_t small) {     do {         singledata_t tmp = big % small;         big = small;         small = tmp;     } while (small != 0);     return big; } int main(int argc, char **argv) {     if (argc < 2) {         printf("Usage: %s \n", argv[0]);         return 1;     }     base = (singledata_t) atoi(argv[1]);     if (base < 2 || base > 65) {         printf("Invalid input, must be an integer between 2 and 65\n");         return 1;     }     if ((base & 1) != 0) return 0;     valid_digits = calloc(base, sizeof(bitset_t));     assert(valid_digits != NULL);     singledata_t *gcd_by_pos = calloc(base, sizeof(singledata_t));     assert(gcd_by_pos != NULL);     singledata_t i = 1;     bitset_t bit = 1;     for (; i < base; ++i, bit <<= 1) {         singledata_t tmp = gcd(i, base * 2);         gcd_by_pos[i] = tmp;         valid_digits[tmp] |= bit;     }     singledata_t highest_2power_gcd = 1;     while (base % highest_2power_gcd == 0) highest_2power_gcd *= 2;     if (highest_2power_gcd < base) {         bitset_t tmp = valid_digits[highest_2power_gcd];         valid_digits[highest_2power_gcd] = valid_digits[highest_2power_gcd / 2];         valid_digits[highest_2power_gcd / 2] = tmp;     }     for (i = 1; i < base; ++i) valid_digits[i] = valid_digits[gcd_by_pos[i]];     free(gcd_by_pos);     nearinflen_uint_t *zero = nearinflen_uint_zero();     try_digits(zero, 1, ~((bitset_t) 0));     free(valid_digits);     return 0; } Code: ::   CK1NoBlame FPTR2 ^CK1Z   DUP FPTR2 ^Z># DUP #2-   BINT63 #> caseSIZEERR   DUPONE #AND #0<> case2DROP   WORDSIZE 1LAMBIND   ERRSET ::     BINT64 dostws BINT1 #>HXS     OVER ONE_DO       DUP ONE{}N 4UNROLL       DUP4UNROLL bitSL     LOOP     DROP' ::       OVER #2* #3+ GETLAM       4PICK bitAND UNROT       3GETLAM FPTR2 ^QMul 1GETLAM       3PICK 3PICKOVER FPTR2 ^#>Z       FPTR2 ^Mod FPTR2 ^Z># #-       DO         INDEX@ #2* #2+ GETLAM         4PICKOVER bitAND         OVER HXS==HXS %0=         ITE_DROP ::           bitNOT 5PICK bitAND           3PICK#1+_ 3PICK INDEX@           FPTR2 ^#>Z FPTR2 ^QAdd           OVER 1GETLAM #<>case           2GETEVAL           ROTDROPSWAP           #2* #2* #3- UNROLL         ;       OVER +LOOP       4DROP     ;     SWAP' NULLLAM OVER #2* #1+     NDUPN DOBIND     1GETLAM ONE_DO       1GETLAM #2* INDEX@       BEGIN         SWAPOVER #/ DROP       #0=UNTIL       DROP #2* #3+ DUP GETLAM       INNERCOMP INDEX@ ROTOVER       #2* #2+ GETLAM bitOR       ROT#1+ {}N SWAP PUTLAM     LOOP     BINT1     BEGIN       #2*       1GETLAM OVER #/ DROP     #0<> UNTIL     1GETLAM OVER#<     ITE_DROP ::       DUP #2* #3+       SWAPOVER GETLAM INNERCOMP       get1 #3+       DUP GETLAM INNERDUP       #4+PICK SWAPROT       OVER #4+ UNPICK_       {}N SWAP PUTLAM       {}N SWAP PUTLAM     ;     1GETLAM ONE_DO       INDEX@ #2* #3+ GETLAM       DUPTYPEHSTR?       ITE_DROP ::         INNERCOMP ONE_DO           DUPROT #2* #3+ PUTLAM         LOOP         DROP       ;     LOOP     BINT0 #>HXS bitNOT BINT1 Z0_     2GETEVAL     ABND   ; ERRTRAP ::     1GETLAM dostws ERRJMP   ;   1GETABND dostws ; Of course the bug invalidates the timing and results I got from that iteration of the algorithm. 24 now takes 111.7_s on the 50g, 22 takes 379.3_s. On the laptop, 60 is still rescued by the large number of small buckets and processes in about a second. Out of the numbers below 60, only 58 takes a significant amount of time (still running after an hour away). I don't have accurate measurements, but the other ones are done in less than five minutes, often significantly less. Edit: another hour and a half later, I noticed the 58 run must have finished while I wasn't looking. With that, we're looking at no solutions on bases from 16 up to 60. RE: Puzzle - RPL and others - Albert Chan - 05-09-2021 01:21 AM (05-07-2021 04:17 PM)3298 Wrote:  On the laptop, 60 is still rescued by the large number of small buckets and processes in about a second. The idea of getting buckets from gcd(2n,d) does not have much to back it up. Buckets from gcd(n, d) does, based on successive removal from set of digits {1 to n-1} This is how we get the non-buckets. (removal of all multiples, i.e. coprime to n) Mod-4 rule is also easy to show. We only need 2 digits. If n = 4k: d3*(4k) + d4 = 0 (mod 4) d4 = 0 (mod 4) ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ → d4 = d8 = d12 = ... = 0 (mod 4) ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ → d2 = d6 = d10 = ... = 2 (mod 4) If n = 4k+2: d3*(4k+2) + d4 = 0 (mod 4) 2*d3 + d4 = 0 (mod 4) d4/2 = -d3 = 1 (mod 2) d4 = 2 (mod 4) ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ → d4 = d8 = d12 = ... = 2 (mod 4) ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ → d2 = d6 = d10 = ... = 0 (mod 4) This generalized to invariant (even n, even k): n + k + dk ≡ 0 (mod 4) --- I tried to see the difference of gcd(n,d) vs gcd(2n,d), and this affect more than 1 bucket. Code: from gmpy2 import gcd def bucket(n, k=1):     b = {}     for i,x in enumerate(gcd(n*k,d) for d in range(n)):             b.setdefault(x, []).append(i)     for i in sorted(b)[:-1]: print i, b[i] >>> bucket(60) 1 [1, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 49, 53, 59] 2 [2, 14, 22, 26, 34, 38, 46, 58] 3 [3, 9, 21, 27, 33, 39, 51, 57] 4 [4, 8, 16, 28, 32, 44, 52, 56] 5 [5, 25, 35, 55] 6 [6, 18, 42, 54] 10 [10, 50] 12 [12, 24, 36, 48] 15 [15, 45] 20 [20, 40] 30 [30] >>> bucket(60,2) 1 [1, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 49, 53, 59] 2 [2, 14, 22, 26, 34, 38, 46, 58] 3 [3, 9, 21, 27, 33, 39, 51, 57] 4 [4, 28, 44, 52] 5 [5, 25, 35, 55] 6 [6, 18, 42, 54] 8 [8, 16, 32, 56] 10 [10, 50] 12 [12, 36] 15 [15, 45] 20 [20] 24 [24, 48] 30 [30] 40 [40] We have 3 splits, how to apply the rules ? Bucket 4 had splitted to 4 + 8 Bucket 12 had splitted to 12 + 24 Bucket 20 splitted to 20 + 40 And more importantly, is it even correct to "swap" buckets ? You might need to show this is always true, before adding it to the code ... RE: Puzzle - RPL and others - 3298 - 05-09-2021 01:39 PM Ugh. I'm getting more and more forgetful... splitting more than a single bucket was the purpose of weaving this into the GCD routine. But as it turns out, all of these buckets need swapping of their respective halves. Sooo... congratulations, you found another bug in my code! (05-09-2021 01:21 AM)Albert Chan Wrote:  The idea of getting buckets from gcd(2n,d) does not have much to back it up. What this is supposed to achieve is the mod4 splitting you're doing in the recursive function, just generalized a bit and ahead of time so as not to bog the recursive function down. (It's enough of a hot-spot, I can tell as much without a profiler.) On 60 this splitting is upgraded into mod8 instead of mod4, because 60 is divisible by 4 but not by 8. Let me show you how it's supposed to work. I'll use 30 for a direct comparison, where it's just the mod4 rule you developed. GCD(n,d) buckets are: - 1: {1 7 11 13 17 19 23 29} - 2: {2 4 8 14 16 22 26 28} - 3: {3 9 21 27} - 6: {6 12 18 24} - 5: {5 25} - 10: {10 20} - 15: {15} - 30: {} (empty because it's the base) I've deliberately sorted these not in increasing order of GCD, but by presence of prime factors of 30 in each GCD (top level sort: prime factor 5 present only in the last four; then prime factor 3, and finally 2). I've also listed an empty bucket for 30, because technically that can be assembled from the prime factors of 30 too, and it makes the sorting by prime factors cleaner. GCD(2n,d) buckets are (new buckets from doubling n are indented further; sorting is the same): - 1: {1 7 11 13 17 19 23 29} - 2: {2 14 22 26} - - 4: {4 8 16 28} - 3: {3 9 21 27} - 6: {6 18} - - 12: {12 24} - 5: {5 25} - 10: {10} - - 20: {20} - 15: {15} - 30: {} (empty because it's the base) - - 60: {} (also empty because it's larger than the base) 30 is a 4k+2 base, so d4, d8, d12, d16, d20, d24, d28 need to be = 2 (mod 4), i.e. 2, 6, 10, 14, 18, 22, 26. That's exactly what's left in the buckets 2, 6, 10 together after splitting them up by doubling n. d2, d6, d10, d14, d18, d22, d26 need to be = 0 (mod4), i.e. 4, 8, 12, 16, 20, 24, which is what's in the new buckets 4, 12, 20. Voila, on n = 30 doing GCD(2n, d) nets us the same split that comes from the mod4 rule. However, as you can easily see, on these split buckets the contents of each corresponds to the digit indices of the other one, instead of its own. That is compensated by swapping them after filling, so I can later simply look them up using the recursion depth counter. (There would be an extra indirection through the list of GCDs by digit, but my code flattens that away too before entering the recursive function.) Example for that: when looping through the possibilities on the 18th digit (d18 in your nomenclature), I pull up bucket GCD(2n, 18) which would normally be bucket 6, but swapping makes me hit the bucket I generated for 12, which contains {12 24}. Those are the digits that actually fit for d18, because they were in the bucket 6 on the old GCD(n, d) table and match the mod4 rule. --- The generalization to mod8 for 8k+4, mod16 for 16k+8, etc. uses the same principle that these necessarily even digits must be preceded by an odd digit. Let's use the variable p to denote the power of 2 that still divides the base, so that makes the generalization a (mod 2p)-rule for the two variants (2pk) and (2pk+p). Adapting your equations, because you can write this stuff up better than me (I hope I didn't slip up somewhere in these equations): If n = 2pk: $$d_{2p-1}\cdot(2pk) + d_{2p} = 0 \pmod{2p}$$ $$d_{2p} = 0 \pmod{2p}$$ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ → $$d_{2p} = d_{4p} = d_{6p} = ... = 0 \pmod{2p}$$ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ → $$d_{p} = d_{3p} = d_{5p} = ... = p \pmod{2p}$$ That part was uninteresting because clearly n is divisible by 2p, so we got a GCD bucket for 2p anyway, even without doubling the base thrown into the GCD algorithm. If n = 2pk+p: $$d_{2p-1}\cdot(2pk+p) + d_{2p} = 0 \pmod{2p}$$ $$2\cdot d_{2p-1} + d_{2p} = 0 \pmod{2p}$$ $$d_{2p}/2 = -d_{2p-1} = p/2 \pmod{p}$$ $$d_{2p} = p \pmod{2p}$$ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ → $$d_{2p} = d_{4p} = d_{6p} = ... = p \pmod{2p}$$ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ → $$d_{p} = d_{3p} = d_{5p} = ... = 0 \pmod{2p}$$ This is the part that is captured by doing GCD(2n, d) instead of GCD(n, d). We throw one additional 2 into the pot of prime factors available to build GCDs out of, and that splits the p bucket into p and 2p in line with this rule, as well as 3p into 3p and 6p, 5p into 5p and 10p, etc. I just need to swap all of these split buckets, not only p and 2p as the code in my previous post does, but also e.g. 3p and 6p. Solid theoretical foundation (I think), sloppy practical implementation. It's not helpful that our only found solutions for this puzzle are on such low bases, because they don't trigger the bug, and on the higher ones there are no solutions that could be skipped due to it. Edit: bugfix was simple, just loop over multiples of highest_2power_gcd, and check if there's actually a bucket at that index. Code: #include  #include  #include  #include  #include  #include  typedef unsigned int singledata_t; typedef unsigned long long doubledata_t; static_assert (sizeof(singledata_t) * 2 == sizeof(doubledata_t)); typedef struct {     unsigned int len;     singledata_t data[]; } nearinflen_uint_t; #define NEARINFLEN_UINT_MALLOC(len) ((nearinflen_uint_t *) \     malloc(sizeof(nearinflen_uint_t) + sizeof(singledata_t) * (len))) #define NEARINFLEN_UINT_SHIFT (sizeof(singledata_t) * 8) static inline nearinflen_uint_t *nearinflen_uint_zero() {     nearinflen_uint_t *n = NEARINFLEN_UINT_MALLOC(1);     assert(n != NULL);     n->len = 1;     n->data[0] = 0;     return n; } static inline bool nearinflen_uint_iszero(nearinflen_uint_t *n) {     int i = 0;     for (; i < n->len; ++i)         if (n->data[i] != 0) return false;     return true; } static inline void nearinflen_uint_release(nearinflen_uint_t *n) {     free(n); } static nearinflen_uint_t *nearinflen_uint_add(nearinflen_uint_t *n, singledata_t m) {     nearinflen_uint_t *res = NEARINFLEN_UINT_MALLOC(n->len + 1);     assert(res != NULL);     unsigned int i = 0;     doubledata_t carry = m;     for (; i < n->len && carry != 0; ++i) {         carry += n->data[i];         res->data[i] = (singledata_t) carry;         carry >>= NEARINFLEN_UINT_SHIFT;     }     if (carry != 0) {         res->data[n->len] = (singledata_t) carry;         res->len = n->len + 1;     } else {         res->len = n->len;         memcpy(res->data + i, n->data + i, sizeof(singledata_t) * (n->len - i));     }     return res; } static nearinflen_uint_t *nearinflen_uint_mul(nearinflen_uint_t *n, singledata_t m) {     nearinflen_uint_t *res = NEARINFLEN_UINT_MALLOC(n->len + 1);     assert(res != NULL);     unsigned int i = 0;     doubledata_t carry = 0;     for (; i < n->len; ++i) {         carry += ((doubledata_t) n->data[i]) * m;         res->data[i] = (singledata_t) carry;         carry >>= NEARINFLEN_UINT_SHIFT;     }     if (carry != 0) {         res->data[n->len] = (singledata_t) carry;         res->len = n->len + 1;     } else res->len = n->len;     return res; } static singledata_t nearinflen_uint_div2(nearinflen_uint_t *n, singledata_t m,         nearinflen_uint_t **res) {     if (res != NULL) {         *res = NEARINFLEN_UINT_MALLOC(n->len);         assert(*res != NULL);         memset((*res)->data, 0, sizeof(singledata_t) * n->len);     }     unsigned int i = n->len - 1;     doubledata_t rem = n->data[i];     if (m > rem) {         if (i == 0) {             if (res != NULL) (*res)->len = 1;             return n->data[0];         }         --i;         rem = (rem << NEARINFLEN_UINT_SHIFT) | n->data[i];     }     if (res != NULL) (*res)->len = i + 1;     doubledata_t m_shifted = m, bit = 1;     for (; rem >= m_shifted; bit <<= 1, m_shifted <<= 1);     while (bit != 1 || i != 0) {         if (bit == 1) {             bit <<= NEARINFLEN_UINT_SHIFT;             m_shifted <<= NEARINFLEN_UINT_SHIFT;             --i;             rem = (rem << NEARINFLEN_UINT_SHIFT) | n->data[i];         }         bit >>= 1;         m_shifted >>= 1;         if (rem >= m_shifted) {             rem -= m_shifted;             if (res != NULL) (*res)->data[i] |= bit;         }     }     return (singledata_t) rem; } typedef unsigned long long bitset_t; typedef struct output_digit_t output_digit_t; struct output_digit_t {     singledata_t digit;     output_digit_t *next; }; singledata_t base; bitset_t *valid_digits; static void try_digits(nearinflen_uint_t *n, singledata_t i, bitset_t not_taken) {     bitset_t valid = not_taken & valid_digits[i];     singledata_t digit = i - nearinflen_uint_div2(n, i, NULL);     for (; digit < base; digit += i) {         bitset_t digit_bitset = 1 << (digit - 1);         if ((valid & digit_bitset) == 0) continue;         nearinflen_uint_t *tmp = nearinflen_uint_add(n, digit);         if (i + 1 != base) {             nearinflen_uint_t *tmp2 = nearinflen_uint_mul(tmp, base);             nearinflen_uint_release(tmp);             try_digits(tmp2, i + 1, not_taken & ~digit_bitset);             continue;         }         /* this destroys digit and n, but we cannot have            two results that only differ in the last digit anyway,            so print and break out */         nearinflen_uint_release(n);         digit = nearinflen_uint_div2(tmp, base, &n);         bool last;         output_digit_t *out = NULL, *prev;         do {             last = nearinflen_uint_iszero(n);             prev = out;             out = malloc(sizeof(output_digit_t));             assert(out != NULL);             out->next = prev;             out->digit = digit;             nearinflen_uint_release(tmp);             tmp = n;             digit = nearinflen_uint_div2(tmp, base, &n);         } while (!last);         nearinflen_uint_release(tmp);         do {             last = out->next == NULL;             printf("%u%c", out->digit, last ? '\n' : ';');             prev = out;             out = out->next;             free(prev);         } while (!last);         break;     }     nearinflen_uint_release(n); } static inline singledata_t gcd(singledata_t big, singledata_t small) {     do {         singledata_t tmp = big % small;         big = small;         small = tmp;     } while (small != 0);     return big; } int main(int argc, char **argv) {     if (argc < 2) {         printf("Usage: %s \n", argv[0]);         return 1;     }     base = (singledata_t) atoi(argv[1]);     if (base < 2 || base > 65) {         printf("Invalid input, must be an integer between 2 and 65\n");         return 1;     }     if ((base & 1) != 0) return 0;     valid_digits = calloc(base, sizeof(bitset_t));     assert(valid_digits != NULL);     singledata_t *gcd_by_pos = calloc(base, sizeof(singledata_t));     assert(gcd_by_pos != NULL);     singledata_t i = 1;     bitset_t bit = 1;     for (; i < base; ++i, bit <<= 1) {         singledata_t tmp = gcd(i, base * 2);         gcd_by_pos[i] = tmp;         valid_digits[tmp] |= bit;     }     singledata_t highest_2power_gcd = 1;     while (base % highest_2power_gcd == 0) highest_2power_gcd *= 2;     for (i = highest_2power_gcd; i < base; i += highest_2power_gcd) {         bitset_t tmp = valid_digits[i];         if (tmp != 0) {             valid_digits[i] = valid_digits[i / 2];             valid_digits[i / 2] = tmp;         }     }     for (i = 1; i < base; ++i) valid_digits[i] = valid_digits[gcd_by_pos[i]];     free(gcd_by_pos);     nearinflen_uint_t *zero = nearinflen_uint_zero();     try_digits(zero, 1, ~((bitset_t) 0));     free(valid_digits);     return 0; } Code: ::   CK1NoBlame FPTR2 ^CK1Z   DUP FPTR2 ^Z># DUP #2-   BINT63 #> caseSIZEERR   DUPONE #AND #0<> case2DROP   WORDSIZE 1LAMBIND   ERRSET ::     BINT64 dostws BINT1 #>HXS     OVER ONE_DO       DUP ONE{}N 4UNROLL       DUP4UNROLL bitSL     LOOP     DROP' ::       OVER #2* #3+ GETLAM       4PICK bitAND UNROT       3GETLAM FPTR2 ^QMul 1GETLAM       3PICK 3PICKOVER FPTR2 ^#>Z       FPTR2 ^Mod FPTR2 ^Z># #-       DO         INDEX@ #2* #2+ GETLAM         4PICKOVER bitAND         OVER HXS==HXS %0=         ITE_DROP ::           bitNOT 5PICK bitAND           3PICK#1+_ 3PICK INDEX@           FPTR2 ^#>Z FPTR2 ^QAdd           OVER 1GETLAM #<>case           2GETEVAL           ROTDROPSWAP           #2* #2* #3- UNROLL         ;       OVER +LOOP       4DROP     ;     SWAP' NULLLAM OVER #2* #1+     NDUPN DOBIND     1GETLAM ONE_DO       1GETLAM #2* INDEX@       BEGIN         SWAPOVER #/ DROP       #0=UNTIL       DROP #2* #3+ DUP GETLAM       INNERCOMP INDEX@ ROTOVER       #2* #2+ GETLAM bitOR       ROT#1+ {}N SWAP PUTLAM     LOOP     BINT1     BEGIN       #2*       1GETLAM OVER #/ DROP     #0<> UNTIL     1GETLAM OVER#>     IT ::       1GETLAM OVER DO         INDEX@ #2* #3+         DUP GETLAM INNERDUP #1=         ITE 3DROP ::           INDEX@ #3+           DUP GETLAM INNERDUP           #4+PICK SWAPROT           OVER #4+ UNPICK_           {}N SWAP PUTLAM           {}N SWAP PUTLAM         ;       DUP +LOOP     ;     DROP     1GETLAM ONE_DO       INDEX@ #2* #3+ GETLAM       DUPTYPEHSTR?       ITE_DROP ::         INNERCOMP ONE_DO           DUPROT #2* #3+ PUTLAM         LOOP         DROP       ;     LOOP     BINT0 #>HXS bitNOT BINT1 Z0_     2GETEVAL     ABND   ; ERRTRAP ::     1GETLAM dostws ERRJMP   ;   1GETABND dostws ; Times haven't changed significantly, and results are (as expected) identical. The 50g takes 111.6_s for 24 and 377.9_s for 22, which is basically the same as the previously reported times. The laptop still does 60 in about a second, 56 and lower in at most a handful of minutes (I paid a little more attention to the clock this time around, 52 takes the crown at about 7 minutes), and 58 shouldn't change at all from its 2.5 hours, since the bugfix has no effect on numbers of the format 2*prime (the new loop just runs from 4 to n in steps of 4 and encounters a bucket only on 4 itself, which the buggy code took care of too). I would be very surprised if this new loop took more than a few microseconds. RE: Puzzle - RPL and others - Albert Chan - 05-10-2021 03:57 AM (05-07-2021 04:17 PM)3298 Wrote:   (05-07-2021 10:35 AM)Albert Chan Wrote:  Since test for bucket n/2 is same as matching gcd = n/2, we can remove it. Hopefully, this is my final version, by removing stuff from recurse5() / puzzle5(). Nah, there's more to remove. Odd/even is covered by whether the GCD is a multiple of 2 or not. So the odd/even test can go. As hinted at in my answer to your private message, as well as in my previous code, the mod4 test on non-mod4 bases can be worked into the GCD partitioning as well. And as I found out after writing the code, it can also be generalized to higher powers of 2, more specifically to the smallest power of 2 that doesn't divide the base anymore. Per your advise, I did just that, implemented gcd(2n,d) buckets. Odd/even stuff all gone ... For n=60, this smarter version finished in 53 seconds (previously at 3 minutes). Code: do local gcd, recurse = require 'factor'.gcd function puzzle(n)     if n%2 ~= 0 then return end -- n must be even     local lst, bad = {}, {}     for i=1,n do lst[i] = i end     for i=1,n do lst[n+i] = gcd(n,i) end     for i=1,n do lst[n+n+i] = gcd(n+n,i) end     for i=n+1,n+n do            -- get bad buckets         local x = lst[i]         if x ~= lst[n+i] then bad[x] = true end     end     for x in pairs(bad) do         for i=n+n+1,n+n+n do    -- fix bad buckets             if x==lst[i] then lst[i-n] = x+x end         end     end     return recurse(lst, n, 1, {}) end function recurse(lst, n, k, X)     -- print(table.concat({unpack(X,1,k-1)},','))     if k==n then return print('{'..table.concat(X,',')..'}') end     local d = 0     for i=1, k-6, 6 do          -- assume n <= 190         d = ((((((d*n + X[i])*n + X[i+1])*n + X[i+2])*n                     + X[i+3])*n + X[i+4])*n + X[i+5]) % k     end     for i = k-(k-1)%6, k-1 do d = d*n + X[i] end     d = k - d*n%k               -- first valid (mod k)     local s = lst[n+n+k]        -- = gcd(n+n,k)     while d < n do         if s == lst[n+d] and lst[d] then             X[k] = d             lst[d] = false      -- mark taken             recurse(lst, n, k+1, X)             lst[d] = d          -- restore         end         d = d + k     end end end (05-09-2021 01:39 PM)3298 Wrote:  GCD(2n,d) buckets are (new buckets from doubling n are indented further; sorting is the same): - 1: {1 7 11 13 17 19 23 29} - 2: {2 14 22 26} - - 4: {4 8 16 28} - 3: {3 9 21 27} - 6: {6 18} - - 12: {12 24} - 5: {5 25} - 10: {10} - - 20: {20} - 15: {15} - 30: {} (empty because it's the base) - - 60: {} (also empty because it's larger than the base) Just to be sure, I peek at the iterations for n = 30, and it matched "swapped" buckets perfectly. 1 1,4 1,4,3 1,4,3,2 1,4,3,2,5 1,4,3,2,5,12 1,4,3,2,5,12,29 1,4,3,2,5,12,29,26 1,4,3,2,5,12,29,26,21 1,4,3,2,5,12,29,26,21,20 1,4,3,2,5,24 1,4,3,2,5,24,19 1,4,3,2,5,24,19,14 1,4,3,2,5,24,19,14,21 1,4,3,2,5,24,19,14,21,20 1,4,3,2,5,24,19,14,21,20,13 1,4,3,2,5,24,19,14,21,20,13,6 1,4,3,2,5,24,19,14,21,20,13,6,17 1,4,3,2,5,24,19,14,21,20,13,6,17,8 1,4,3,2,5,24,19,14,21,20,13,6,17,8,15 1,4,3,2,5,24,19,14,21,20,13,6,17,8,15,22 ... RE: Puzzle - RPL and others - Albert Chan - 05-10-2021 05:13 PM I was a bit confused about proof of gcd(2n,d) buckets, perhaps this help. Let base n = p*m, where p is power-of-2, m is odd, m > 1 d2p-1 * n + d2p ≡ 0 (mod 2p) d2p ≡ (-p*m) * d2p-1 (mod 2p) d2p / p ≡ -m * d2p-1 (mod 2) = 1 (mod 2) d2p ≡ p (mod 2p) ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ → d2p ≡ d4p ≡ d6p ≡ ... ≡ p (mod 2p) dp-1 * n + dp ≡ 0 (mod p) dp ≡ (-p*m) dp-1 = 0 (mod p) → d1p ≡ d3p ≡ d5p ≡ ... ≡ 0 (mod 2p) ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ // p (mod 2p) were already taken Instead of written in modulo form, we write the possible cases (we just don't know the order) (note: m is odd, so m-1 is even, m-2 is odd ...) d2p, d4p, d6p ... d(m-1)p﻿ ﻿ =﻿ ﻿ 1p, 3p, 5p ... (m-2)p d1p, d3p, d5p ... d(m-2)p﻿ ﻿ =﻿ ﻿ 2p, 4p, 6p ... (m-1)p Note indexes on the left does not match the possible values on the right. But, if we swap the values (thru another indirection), both side matches. Thus, we need to swap split buckets. --- What happen if we extend this to gcd(4n, d) ? d4p-1 * n + d4p ≡ 0 (mod 4p) d4p ≡ (-p*m) * d4p-1 (mod 4p) d4p / p ≡ -m * d4p-1 (mod 4) = ±1 (mod 4) d4p / p ≡ 1 (mod 2) d4p ≡ p (mod 2p) Unfortunately, this is the same result from gcd(2n, d) buckets. RE: Puzzle - RPL and others - 3298 - 05-10-2021 08:23 PM Another private message from Albert Chan, another bug uncovered ... and now fixed. This one is very subtle. Before: Code: bitset_t digit_bitset = 1 << (digit - 1); After: Code: bitset_t digit_bitset = ((bitset_t) 1) << (digit - 1); Without this typecast, it breaks for bitshifts greater than or equal to the number of bits in an int (typically 32 bits), which would be encountered on bases higher than 32. C isn't being very friendly here. But that's the benefit of having an alternative implementation to compare to, since it's quite hard to verify the results by hand: discrepancies between them uncover bugs in one or the other. Thank you for not simply believing my results! RE: Puzzle - RPL and others - Albert Chan - 05-11-2021 11:58 AM (05-10-2021 08:23 PM)3298 Wrote:   Code: bitset_t digit_bitset = ((bitset_t) 1) << (digit - 1);` Without this typecast, it breaks for bitshifts greater than or equal to the number of bits in an int (typically 32 bits), which would be encountered on bases higher than 32. C isn't being very friendly here. Without typecast, it may already be broken with left shift 31 1 << 31 = -2147483648 If bitset_t is 32-bits, not harm done, digit_bitset = 0x80000000 If bitset_t is 64-bits, it messed up, digit_bitset = 0xffffffff80000000 RE: Puzzle - RPL and others - 3298 - 05-11-2021 02:14 PM (05-11-2021 11:58 AM)Albert Chan Wrote:  Without typecast, it may already be broken with left shift 31 1 << 31 = -2147483648 If bitset_t is 32-bits, not harm done, digit_bitset = 0x80000000 If bitset_t is 64-bits, it messed up, digit_bitset = 0xffffffff80000000 That would be sign-extension. I typedef'd bitset_t to be unsigned long long, which should as an unsigned type receive zero-extension instead. Anyway, a shift of 31 only occurs when the base hits 33, and since 33 (as an odd number) is treated with an early exit, the program would run into the undefined result of a shift by 32 at the same time (starting from base=34). Perhaps I should've picked a more forgiving language for the port to a more powerful platform, like, say, Haskell ... C is certainly happy to stab you in the back when you treat it wrong. RE: Puzzle - RPL and others - John Keith - 05-11-2021 03:55 PM (05-11-2021 02:14 PM)3298 Wrote:  Perhaps I should've picked a more forgiving language for the port to a more powerful platform, like, say, Haskell ... C is certainly happy to stab you in the back when you treat it wrong. Though Haskell is one of my favorite languages, I'm not sure it's the best choice for the sort of bit-twiddling that you two are doing here. C is not a very user-friendly language but that is partly due to the access to the "bare metal" that C allows. I have recently been looking at Julia and I am quite interested, as it seems to be designed for mathematical and scientific uses. It also seems to have a large selection of integer data types. Anyway, I am following this thread with interest even though it's well above my pay grade. Carry on!