Post Reply 
[VA] SRC #013 - Pi Day 2023 Special
03-24-2023, 10:40 PM
Post: #21
RE: [VA] SRC #013 - Pi Day 2023 Special
  
Hi, J-F Garnier and EdS2,

J-F Garnier Wrote:So this time we got two solutions from Valentin, plus as often interesting comments and references.

Thanks for your continued appreciation, J-F. I created first solution 1, as doing a sieve was the fastest way to accomplish the goal, but as HP didn't see fit to implement BYTE and/or BOOLEAN arrays in 71 BASIC, like many other high-level languages do, the memory consumption of INTEGER arrays (3 bytes per element) required lots of RAM for large N, and thus I created solution 2, which in the long run gets increasingly slower but has insignificant memory requirements.

J-F Garnier Wrote:My last solution posted just a few hours before Valentin's final solutions probably didn't catch much attention. Yet I was quite happy with the tricks I used to speed up my code by almost a factor of 2.

Oh yes, it did. I noticed it at once but refrained from replying something outright as I intended to include a version of my solution 1 able to run in puny BASIC dialects without using any of the HP-71 BASIC enhancements provided by the Math and JPC ROMs, as EdS2 requested here for using it on BBC BASIC or other "ordinary BASIC" (his words).

J-F Garnier Wrote:One trick was to skip the unnecessary evaluation of the Möbius function for all multiples of 4, i.e. one number over 4 or a 25% gain. But this was not enough to beat Valentin's timings, and even more with his last 5% optimization of his fastest version.

I like the way you never let go until you succeed, and your "tricks" (I'd call them "techniques") are indeed clever and I'm sure interested people will benefit from studying them.

J-F Garnier Wrote:So here is another attempt, pushing the optimisation further towards speed with just a slightly larger code, but no huge memory requirements. [...]

Ditto. However, your use of PRIM means, as I've said, that in the long run that approach can't beat sieve-based methods, because finding a factor (or ensuring that there's none) for large enough numbers is quite costly.

J-F Garnier Wrote:Execution times are now closer ... no, better (slightly) than Valentin's results :-) [...] More results on Emu71/Win, in fast and Authentic modes

Congratulations, but I think you mentioned before that your Emu71/Win is running at ~ 1,500x, I quote:

      "I estimate a peak ratio of ~1500x on my latest Ryzen5 laptop"

while my timings are for a ~ 976x instance, so it would be proper to take that speed difference into account for accurate comparisons of the Emu71/Win timings.

J-F Garnier Wrote:Do we have to stop at 1E12? No! [...] Of course, the final count will have only 12 significant digits, but if is correct within a few ULPs as expected, then we can get an approximation of pi with the same accuracy.

Impressive ! ... Smile ... and that last approximation for N = 1E15, namely 3.14159265359, is a sight to behold ! Congratulations, again !



EdS2 Wrote:I'm thinking, on a platform which lacks the very handy PRIM predicate, one might proceed first with a sieve and then lookup into that for the PRIM calls. [...] It might be that saving the primes in this way is very similar to Valentin's program which records the Möbius values, although in that case I think we still need an implementation of FPRIM.

Here you are, the promised code for your first "related challenge", to run my solution 1 for the HP-71B but using an "unaugmented [sic] 71B, or a similarly ordinary Basic" (your words):

  EDS2    10 lines, 389 bytes

   10  DESTROY ALL @ INPUT T @ SETTIME 0 @ N=IP(SQR(T)) @ GOSUB 70
   20  INTEGER M(N) @ FOR K=1 TO N @ M(K)=1 @ NEXT K @ L=1 @ P=2
  *30  S=P*P @ FOR K=S TO N STEP S @ M(K)=0 @ NEXT K @ FOR K=P TO N STEP P
   40  M(K)=-M(K) @ NEXT K @ L=L+1 @ IF L<=W THEN P=F(L) @ GOTO 30
   50  S=T @ FOR K=2 TO N @ IF M(K) THEN S=S+M(K)*(T DIV (K*K))
   60  NEXT K @ R=SQR(6*T/S) @ DISP T;S;R;ABS(PI-R);TIME @ END

  *70  H=CEIL(N/2) @ INTEGER F(H) @ F(1)=1 @ FOR S=2 TO (SQR(2*H)+1)/2 @ IF F(S) THEN 90
   80  FOR K=2*S*(S-1)+1 TO H STEP 2*S-1 @ F(K)=1 @ NEXT K
  *90  NEXT S @ W=1 @ F(1)=2 @ FOR K=1 TO H @ IF NOT F(K) THEN W=W+1 @ F(W)=2*K-1
  100  NEXT K @ INTEGER F(W) @ RETURN

Description:
    ● Lines 10 to 60 are essentially the same as in my solution 1 for the HP-71B, with changes (described below in Notes) to adapt it to run in plain-vanilla BASIC dialects. In line 10, before beginning the sieve to compute the Möbius function values en masse, the subroutine beginning at line 70 is first called to obtain at once all prime numbers required

    ● Lines 70 to 100, nonexistent in solution 1, are now necessary to implement the functionality that the FPRIM keyword allowed there. This subroutine (called just once, before performing the Möbius sieve) computes and stores in integer array F all prime numbers required by performing yet another sieve. Once completed, the prime numbers identified are placed consecutively in array F, which is then redimensioned to reclaim memory, thus leaving more available to later dimension integer array M for the second sieve.

    ● As was the case with the Möbius sieve, the new prime sieve doesn't use any factoring, multiplications or divisions, for maximum speed. As will be seen in the timings below, this adapted program takes only 24% longer than the original one, which used the assembler keyword FPRIM, so in this battle between my BASIC code vs. JPC Assembler code, I declare myself the winner ! Smile
Notes:
    This is based on my solution 1 for the HP-71B but with the following changes:

    ● I've replaced MAT M=CON, WHILE .. END WHILE, FPRIM and RES by functionally equivalent code, and I've deleted USING "image". No keywords from the Math or JPC ROMs remain. The loop at lines 30-40 can be reimplemented using a WHILE .. WEND construct, if available.

    ● I'm not sure if DIV (integer division) exists in BBC BASIC or other "ordinary" dialects. If it doesn't, A DIV B can be replaced by INT(A/B).

    ● Same thing with H=CEIL(N/2) at line 70. If CEIL isn't available, it will need to be replaced by trivial equivalent code.

    ● For convenience while running and timing the code, I've left in the above code the HP-71B BASIC keywords DESTROY ALL, SETTIME 0 and TIME, which should be either deleted or replaced by equivalent code in other BASIC dialects.

    ● The integer array F is first dimensioned in line 70 as INTEGER F(H), with a variable number of elements H. In BASIC dialects which do not admit dimensioning arrays with a variable number of elements given by an expression, F will probably need to be dimensioned to the fixed maximum number of elements to use. Same thing happens at line 20 INTEGER M(N).

    ● Also, the same array F is later redimensioned in line 100 as INTEGER F(W), reducing its size (because W is always < H) to reclaim memory. If the plain-vanilla dialect can't redimension arrays, or if it can but it doesn't keep intact the unaffected elements (initializes or destroys the whole array's contents,) then this redimensioning will not be possible and memory won't be reclaimed, which will limit the maximum argument N being input to the program.

    The memory savings by redimensioning can be huge. For instance, for N = 100,000,000, redimensioning F reduces it from 5,000 elements to just 1,229 elements.

    ● As it stands now, the program recomputes the array of prime numbers each time it's run. If there's a RAM disk available or other sufficiently fast way to store and retrieve the array, it could be computed to its maximum size and stored just once, to be retrieved in whole or in part when the program is executed afterwards. This would reduce the running time considerably, depending on the retrieval speed vs. the computation speed.
Results:

Upon running the program with the following arguments on a physical/virtual HP-71B, we get these results and timings:
        N       Count     π approx       |Error|       go71b  Emu71/Win  Physical
                                                       @128x    @976x     HP-71B 
    ------------------------------------------------------------------------------
       12345      7503  3.14198204634  0.00038939275   0.12"    0.02"        15"
      100000     60794  3.14155932716  0.00003332643   0.34"    0.04"        44"
      567890    345237  3.14158683822  0.00000581537   0.83"    0.11"     1' 46"
     1000000    607926  3.14159550063  0.00000284704   1.11"    0.15"     2' 22"

    10000000   6079291  3.14158749068  0.00000516291   3.61"    0.47"     7' 42"
    25000000  15198180  3.14159239999  0.00000025360   5.77"    0.76"    12' 19"
    33000000  20061593  3.14159276017  0.00000010658   6.65"    0.87"    14' 11"

       1E8    60792694  3.14159307180  0.00000041821  11.71"    1.54"    24' 59"
       1E9   607927124  3.14159259637  0.00000005722  38.00"    4.98"    1h  21'

That's all. Hope it's useful, or at least enjoyable. Tell me how it goes when running it on your beloved BBC BASIC.

V.

  
All My Articles & other Materials here:  Valentin Albillo's HP Collection
 
Visit this user's website Find all posts by this user
Quote this message in a reply
03-25-2023, 06:02 PM (This post was last modified: 03-29-2023 02:17 AM by DavidM.)
Post: #22
RE: [VA] SRC #013 - Pi Day 2023 Special
Using the Möbius-based algorithm already presented by 2old2randr and Valentin above, I put together the following SysRPL implementation to see how it would perform on a real 50g. I was curious as to how a pure SysRPL implementation would compare to the hybrid version already posted by 2old2randr.

This particular implementation makes use of a customized version of the Möbius function that Gerald H created. Instead of calling the function code as a subroutine, it is simply executed inline in the loop that accumulates the final count. I also wanted the result as an approximate number instead of an exact integer in order to reduce the need for subsequent conversions. Otherwise, it's the same method as Gerald's original Möbius function.

The rest of the code is fairly standard stack-based computations. As with any SysRPL implementation, there's an argument check at the beginning. This is followed by the sum-loop, and then the final computation of SQRT(6*N/Count):

Edit: this code has been updated to correct an issue with the computation of Count (identified below in this thread)
!NO CODE
!RPL
::
   CK1NOLASTWD
   CK&DISPATCH1
   real ::
      %1 OVER %> caseSIZEERR                    ( N must be a positive integer <= [1E12-1] )
      DUP %IP OVER %<> caseSIZEERR
      DUP % 9.99999999999E11 %> caseSIZEERR

      DUP %SQRT %IP>#                           ( maxloop: IP[SQRT[N]] )
      %1                                        ( initial i value )
      %0                                        ( initial S subtotal )
      ROT ZERO_DO                               ( for i = 0 to [maxloop-1] )
         OVERDUP %1 %= ?SKIP ::                 ( skip mu[i] if i=1 )
            FPTR2 ^R>Z                          ( convert i to zint for factoring )
            FPTR2 ^MZSQFF                       ( obtain factor meta )
            #2/                                 ( convert meta size to pair count )
            %1 SWAP                             ( seed mu[] result with 1 )
            ZERO_DO                             ( Gerald's mu[] method: )
               ROTDROPSWAP                      (   check each factor magnitude, )
               %1 %= ITE %CHS DROP%0_           (   alter current mu[] result for given value )
            LOOP                                ( next factor )
         ;                                      ( end of mu[] )
         2OVER                                  ( SL2: N, SL1: i )
         DUP %*                                 ( i^2 )
         2DUP %MOD                              ( N MOD i^2 )
         ROTSWAP %-                             ( Q = N - [N MOD i^2] )
         SWAP %/                                ( Q = Q / i^2 )
         %*                                     ( Q * mu[i] )
         %+SWAP                                 ( increment/decrement S, swap with i )
         %1+ SWAP                               ( increment i, swap with S )
      LOOP                                      ( next i )
      SWAPDROPDUP                               ( drop i, dup S )
      %6 4ROLL %*                               ( 6*N )
      SWAP %/                                   ( 6*N/S )
      %SQRT                                     ( SQRT[6*N/S] )
   ;
;
@

Size: 182.5 bytes
CRC: #79C1h


Here's the results of similar inputs to those posted by others. Except where noted, the runtimes listed are averages of 3 runs:

                                                                 Physical 50g
           N             Count   Approximation         |error|        Runtime
------------   ---------------   -------------   -------------   ------------

          10                 7   2.92770021885   0.21389243474    00:00:00.05
      12,345             7,503   3.14198204634   0.00038939275    00:00:02.36
     100,000            60,794   3.14155932716   0.00003332643    00:00:06.65
     567,890           345,237   3.14158683822   0.00000581537    00:00:15.95
   1,000,000           607,926   3.14159550063   0.00000284704    00:00:21.22
  10,000,000         6,079,291   3.14158749068   0.00000516291    00:01:09.68
  25,000,000        15,198,180   3.14159239999   0.00000025360    00:01:52.44
  33,000,000        20,061,593   3.14159276017   0.00000010658    00:02:09.21
         1E8        60,792,694   3.14159307180   0.00000041821    00:03:51.44
         1E9       607,927,124   3.14159259637   0.00000005722    00:12:55.88
        1E10     6,079,270,942   3.14159267337   0.00000001978    00:44:36.81
        1E11    60,792,710,280   3.14159265115   0.00000000244    02:39:38.69 *
      1E12-1   607,927,102,274   3.14159265250   0.00000000109    10:03:57.12 *

      
* Time shown is for a single run
Find all posts by this user
Quote this message in a reply
03-26-2023, 01:31 AM (This post was last modified: 03-26-2023 01:32 AM by 2old2randr.)
Post: #23
RE: [VA] SRC #013 - Pi Day 2023 Special
Thank you, David for that very nicely annotated (and indented) code. I am learning Sys RPL and example programs are very helpful.

A couple of questions regarding your program - what are the two lines (!NO CODE and !RPL) at the beginning of the program for? Secondly, what is the meaning of the underscore in DROP%0_?

Thanks
Sudhir
Find all posts by this user
Quote this message in a reply
03-26-2023, 03:37 AM
Post: #24
RE: [VA] SRC #013 - Pi Day 2023 Special
(03-26-2023 01:31 AM)2old2randr Wrote:  ...what are the two lines (!NO CODE and !RPL) at the beginning of the program for? Secondly, what is the meaning of the underscore in DROP%0_?

"!NO CODE" and "!RPL" are directives for the built-in MASD compiler that designate the appropriate modes for translation and output. They are specific to that particular compiler, and may not be needed if you set system flag -92. Other compilers (such as the HP Developer Tools) may not recognize them at all.

The underscore suffix is a convention used by the HP designers to indicate that the opcode is for an entry point that is not official, but is in a safe area and can be used in programs as needed. The underscore is actually part of the opcode's name, and doesn't have any special meaning for the compiler. You may also see certain SysRPL opcodes written within parentheses, which is another way to designate the same thing. You just need to know what your particular compiler is expecting.
Find all posts by this user
Quote this message in a reply
03-27-2023, 02:02 PM
Post: #25
RE: [VA] SRC #013 - Pi Day 2023 Special
Thanks Valentin, David, J-F. I'm hoping to be able to apply some brain power to this and will post if I get something working. If I don't post, it's because I haven't marshalled the effort.
Find all posts by this user
Quote this message in a reply
03-27-2023, 02:28 PM
Post: #26
RE: [VA] SRC #013 - Pi Day 2023 Special
I hadn't noticed that the final Count value I listed in the table in post #22 was a bit off until Valentin sent me a note about it. We both assumed a typo, but it turns out not to be. The program I posted definitely produces the Count value specified for an input of 999,999,999,999.

While it certainly feels like some sort of rounding discrepancy, I haven't yet found the spot where things break down. I'm still working on it as time permits, but I wanted to let everybody know that it appears there's still another "puzzle in the puzzle". At least in the version I posted. Smile
Find all posts by this user
Quote this message in a reply
03-27-2023, 03:48 PM
Post: #27
RE: [VA] SRC #013 - Pi Day 2023 Special
(03-27-2023 02:28 PM)DavidM Wrote:  I hadn't noticed that the final Count value I listed in the table in post #22 was a bit off until Valentin sent me a note about it. We both assumed a typo, but it turns out not to be. The program I posted definitely produces the Count value specified for an input of 999,999,999,999.

While it certainly feels like some sort of rounding discrepancy, I haven't yet found the spot where things break down. I'm still working on it as time permits, but I wanted to let everybody know that it appears there's still another "puzzle in the puzzle". At least in the version I posted. Smile

I didn't notice this discrepancy !
However ... running Valentin's 2nd solution also returns the same wrong value:

1 DESTROY ALL @ INPUT T @ SETTIME 0 @ U=-1 @ S=T @ FOR K=2 TO SQR(T) @ N=K @ F=1
2 D=PRIM(N) @ IF NOT D THEN S=S+U^F*IP(T/(K*K)) ELSE IF MOD(N,D*D) THEN F=F+1 @ N=N/D @ GOTO 2
3 NEXT K @ DISP USING "2(3DC3DC3DC3D,2X),2(Z.8D,X),5DZ.2D";T,S,SQR(6*T/S),ABS(PI-RES),TIME

>RUN
?1E12-1
999,999,999,999 607,927,102,272 3.14159265 0.00000000 178.90


J-F
Visit this user's website Find all posts by this user
Quote this message in a reply
03-27-2023, 07:46 PM
Post: #28
RE: [VA] SRC #013 - Pi Day 2023 Special
.
Hi, J-F and DavidM,

(03-27-2023 03:48 PM)J-F Garnier Wrote:  
(03-27-2023 02:28 PM)DavidM Wrote:  I hadn't noticed that the final Count value I listed in the table in post #22 was a bit off until Valentin sent me a note about it. We both assumed a typo, but it turns out not to be. The program I posted definitely produces the Count value specified for an input of 999,999,999,999 [...]

I didn't notice this discrepancy !
However ... running Valentin's 2nd solution also returns the same wrong value:

1 DESTROY ALL @ INPUT T @ SETTIME 0 @ U=-1 @ S=T @ FOR K=2 TO SQR(T) @ N=K @ F=1
2 D=PRIM(N) @ IF NOT D THEN S=S+U^F*IP(T/(K*K)) ELSE IF MOD(N,D*D) THEN F=F+1 @ N=N/D @ GOTO 2
3 NEXT K @ DISP USING "2(3DC3DC3DC3D,2X),2(Z.8D,X),5DZ.2D";T,S,SQR(6*T/S),ABS(PI-RES),TIME

>RUN
?1E12-1
999,999,999,999 607,927,102,272 3.14159265 0.00000000 178.90

Shocking ! I didn't notice at the time I posted my two original solutions because I never saw the wrong count, just the correct one. A little trip down the memory lane reveals what happened and why I didn't saw it:

When I ran my solution 2 for various N, the last one I entered at the input prompt was 1E12, exactly, without the " -1" used to make it the largest 12-digit integer 999,999,999,999. The program merrily run and output this:
    >RUN
          ?1E12
                WRN L3:IMAGE Ovfl
                ***,***,***,*** 607,927,102,274 [...]
and I thought, "Gosh, the image 3DC3DC3DC3D caters only for 12 digits and 1E12 = 1,000,000,000,000 which has 13, thus the image overflow. But the output count, 607,927,102,274, is fully correct as per tables, so no problem."

Thus, instead of changing the already sizeable image, and as Moebius(1E12) = 0 (because 1E12 is divisible my many squares 4, for instance,) it's adding nothing to the tally, so the count for 1E12-1 must be the same value, 607,927,102,274, and thus I simply posted the input as 1E12-1 to avoid the ungainly IMAGE overflow, fully expecting that the program would produce the exact same value as the one computed for 1E12 proper so I didn't bother to check it out by actually running it again.

Alas, as J-F discovered, it does not, possibly because of some rounding error, but the solution is quite simple because the only operations which can result in such errors are the square root (which is innocent) and divisions. A little examination reveals that the culprit is IP(T/(K*K)) at line 2, and changing it to (T DIV (K*K)), like this:

      2 D=PRIM(N) @ IF NOT D THEN S=S+U^F*(T DIV (K*K)) ELSE IF MOD(N,D*D) THEN F=F+1 @ N=N/D @ GOTO 2

completely solves the problem (177 bytes):
    1 DESTROY ALL @ INPUT T @ SETTIME 0 @ U=-1 @ S=T @ FOR K=2 TO SQR(T) @ N=K @ F=1
    2 D=PRIM(N) @ IF NOT D THEN S=S+U^F*(T DIV (K*K)) ELSE IF MOD(N,D*D) THEN F=F+1 @ N=N/D @ GOTO 2
    3 NEXT K @ DISP USING "2(3DC3DC3DC3D,2X),2(Z.8D,X),5DZ.2D";T,S,SQR(6*T/S),ABS(PI-RES),TIME

    >RUN
          ?1E12-1
               999,999,999,999  607,927,102,274 [...]
Probably something like this may be affecting DavidM's program and the solution might possibly be along the same lines.

Best regards.
V.

  
All My Articles & other Materials here:  Valentin Albillo's HP Collection
 
Visit this user's website Find all posts by this user
Quote this message in a reply
03-28-2023, 03:10 PM (This post was last modified: 03-29-2023 02:22 AM by DavidM.)
Post: #29
RE: [VA] SRC #013 - Pi Day 2023 Special
(03-27-2023 07:46 PM)Valentin Albillo Wrote:  Probably something like this may be affecting DavidM's program and the solution might possibly be along the same lines.

That was indeed the problem. I've updated the SysRPL code in post #22 to reflect a fix for the issue.

Unfortunately there is no built-in SysRPL operator for integer division with real/approximate arguments, so I couldn't use the same approach that Valentin was able to use.

I could have opted to refactor the code using ZINTs (exact integers) so as to use the supported integer division operator for those, but that would reduce the performance gains that I was seeking. A simple test showed about a 10% runtime hit going that route.

While there's no supported integer division operator that I could tap into, there is a %MOD operator that allowed me to "back into" the integer quotient by subtracting the remainder from N prior to dividing by i^2. Although somewhat convoluted, it works in this situation and only cost about a 2-3% penalty in runtime.

I'm still working on documenting the new runtimes, and will update post #22 when I've got the corrected data.

Edit: Runtimes have now been updated as well as the program code.
Find all posts by this user
Quote this message in a reply
03-29-2023, 01:16 AM
Post: #30
RE: [VA] SRC #013 - Pi Day 2023 Special
.
Hi, all,

Adding to this:

Valentin Albillo Wrote:Alas, as J-F discovered, it does not, possibly because of some rounding error, but the solution is quite simple because the only operations which can result in such errors are the square root (which is innocent) and divisions. A little examination reveals that the culprit is IP(T/(K*K)) at line 2, and changing it to (T DIV (K*K)), like this: [...]

Indeed IP(T/(K*K)) and T DIV (K*K), which would appear at first sight to be equivalent, do really differ at times (though very rarely and for large values of T, it seems,) when the former's rounding does not match the latter's truncation.

A trivial program I wrote (relatively) quickly finds all mismatches for various very large integer T and for K from 2 to IP(√T) (i.e. ~ one million possible cases for the first eight values of T listed):
          T              # Mismatches            K
    -----------------------------------------------------------
    999,999,999,999      31 instances      2, 5, 8, 16, 20, ...
    999,999,999,998      19 instances      2, 3, 8, 20, 25, ...
    999,999,999,997      12 instances      3, 8, 25, 80, ...
    999,999,999,996       3 instances      3, 3125, 31250, ...
    999,999,999,995       3 instances      2, 3, 254
    999,999,999,994       1 instance       254
    999,999,999,993       1 instance       254
    999,999,999,992       0 instances       -
     99,999,999,999       0 instances       -
As you can see, for T = 999,999,999,999 there are 31 different instances (in about a million) where IP(T/(K*K)) differs from T DIV (K*K), for K ranging from 1 to IP(√T). The instances begin at K = 2 (249,999,999,999 vs. 250,000,000,000, respectively) and end at K = 500,000 (3 vs. 4, respectively).

Doing the same with T = 999,999,999,998, there's just 19 instances reported instead of 31, and with T = 999,999,999,997 just 12. By the time T equals 999,999,999,995, a mere 3 faulty instances remain (namely for K = 2, 3 and 254), then 999,999,999,994 and 999,999,999,993 have just the one mismatch (in a million !) and for 999,999,999,992 and below there seems to be none.

Also, as expected, running this small program for input values with less than 12 digits, say T = 99,999,999,999 instead, i.e. 1E11 - 1, no instances of mismatches appear at all, and probably the same happens for all smaller T.

V.

  
All My Articles & other Materials here:  Valentin Albillo's HP Collection
 
Visit this user's website Find all posts by this user
Quote this message in a reply
03-29-2023, 03:42 PM (This post was last modified: 03-29-2023 04:20 PM by EdS2.)
Post: #31
RE: [VA] SRC #013 - Pi Day 2023 Special
Right then, after a false start yesterday afternoon, resulting in a need for aspirin and a nap, I've managed to do a bit of coding. I chose J-F's simplest (first) submission as a template, and wanted to make use of sieve code from a previous discussion on Stardot. After a great deal of uglification and debugging, and using some more of J-F's ideas, I have the following (can run it here):

100 REM based on SRC13 by J-F Garnier (first attempt)
110 DATA 1000,12345,1E5,1E6
120 REPEAT:READ N%
130 T%=TIME
140 L%=SQR(N%)
150 REM start with a sieve of smallest prime factors
160 Z%=SQR(L%)
170 DIM V% L%+5
180 FORI%=1TOL%+1STEP4:V%!I%=0:NEXT
190 P%=2:REPEATFORI%=P%*P%TOL%STEPP%:IFV%?I%ELSEV%?I%=P%
200 NEXT:REPEATP%=P%+1:UNTILV%?P%=0
210 UNTILP%>Z%:IF P%>255 STOP
220 REM now the counting of the squarefree
230 S%=N%:FOR I%=2 TO L%
240 R%=I%:C%=-1:Q%=1
250 P%=V%?R%:IFP%ELSEP%=R%
260 IFP%=Q%M%=0ELSEIFP%=R%M%=C%ELSEC%=-C%:R%=R%DIVP%:Q%=P%:GOTO250
270 S%=S%+M%*(N%DIV(I%*I%)):I%=I%+1
280 R%=I%:C%=-1:Q%=1
290 P%=V%?R%:IFP%ELSEP%=R%
300 IFP%=Q%M%=0ELSEIFP%=R%M%=C%ELSEC%=-C%:R%=R%DIVP%:Q%=P%:GOTO290
310 S%=S%+M%*(N%DIV(I%*I%)):I%=I%+2
320 R%=I%:C%=-1:Q%=1
330 P%=V%?R%:IFP%ELSEP%=R%
340 IFP%=Q%M%=0ELSEIFP%=R%M%=C%ELSEC%=-C%:R%=R%DIVP%:Q%=P%:GOTO330
350 S%=S%+M%*(N%DIV(I%*I%))
360 NEXT
370 T%=TIME-T%
380 PRINT N%" "S%" "T%/100"s"
390 U.FA.:END


The timings and counts are as follows:

       1000       608      0.33s
      12345      7503      1.20s
     100000     60794      3.59s
     567890    345237      8.98s
    1000000    607926     12.07s


I was able to push this as far as 7.8E8 with result 474183174, which I can't readily check, using a faster unrealistic emulation mode. (Can't get Wolfram to give me large results - any tips?)

Other than that, all the above I ran in accurate emulation of a BBC Micro, a 2MHz 6502 machine with 32k RAM and a rather good 16k Basic. There are models with more available RAM and faster CPUs and I should be able to get to 1E9 one way or another, and probably higher.

The only peculiarity of BBC Basic which you might need to understand is the use of DIM, ? and ! to allocate and directly access RAM. Fairly obviously, the use of the % sign is to denote integer variables.

Thanks to everyone for the materials and discussion! And to Valentin for the challenge. I've posted over on Stardot too, of course.

I note that some vintage HP desktops offer a Basic, it might be interesting to see what they can do.
Find all posts by this user
Quote this message in a reply
04-01-2023, 11:39 PM
Post: #32
RE: [VA] SRC #013 - Pi Day 2023 Special
  
Hi, all,

To end this thread, there's the subject of finding values of N which result in an approximation to \(\pi\) much closer than what would be statistically expected. Of course, the larger N, the better the appoximation on the long run, but perhaps there are values of N which defy the odds and result in unexpectedly close approximations.

To settle the matter, this 179-byte 4-liner finds them up to a given maximum N very, very quickly, listing all successive record-breakers:
    1  DESTROY ALL @ INPUT N @ SETTIME 0 @ INTEGER M(N) @ MAT M=CON @ P=2 @ WHILE P<=N @ S=P*P
    2  FOR K=S TO N STEP S @ M(K)=0 @ NEXT K @ P=FPRIM(P+1) @ END WHILE @ V=10 @ S=7
    3  FOR T=11 TO N @ S=S+(M(T)#0) @ X=SQR(6*T/S) @ Y=ABS(PI-X) @ IF Y<V THEN V=Y @ DISP T;X;V
    4  NEXT T @ DISP "OK";TIME
Let's find all record-breakers for N up to 30,000 (requires ~ 90 Kb of RAM):
    >RUN ->  ? 30000

         N       Pi Approx.       |Error|
       -------- -----------------------------
       (smallest values, which result in irrelevant approximations, not listed)     
          28   3.14362099197   0.00202833838
         153   3.14180962853   0.00021697494
         426   3.14145282771   0.00013982588
         862   3.14169206124   0.00009940765
         931   3.14153751379   0.00005513980
         936   3.14164722334   0.00005456975
         982   3.14155164428   0.00004100931
        1033   3.14156437967   0.00002827392
        1061   3.14161860223   0.00002594864
        1135   3.14158641730   0.00000623629
        1186   3.14159601478   0.00000336119
        2094   3.14159185312   0.00000080047
        5147   3.14159305181   0.00000039822
        5374   3.14159277156   0.00000011797
        7241   3.14159270516   0.00000005157
       14709   3.14159260812   0.00000004547
       25684   3.14159262180   0.00000003179


    OK timing (go71b: 23.37", Emu71/Win: 3.06", physical HP-71B: 49' 51")
Adding up more RAM to a virtual/physical HP-71B, we can go further, say up to N = 100,000 using ~300 Kb. We obtain the following additional record-breakers:
       65623   3.14159266166   0.00000000807
       67490   3.14159265758   0.00000000399
       89440   3.14159265330   0.00000000029

The above list of record-breakers would seem to end the discussion, but there's something which doesn't look good, the fact that there are several series of them which feature very close N and errors, such as these:
    931 3.14153751379 0.00005513980
    936 3.14164722334 0.00005456975
    982 3.14155164428 0.00004100931
looking almost redundant, as they're not that different.

It would seem preferable to find and list only those record-breakers which are a significant improvement over their predecessors, where "significant improvement" obeys some suitable criterium, and this 6-liner implements just that:
    1  DESTROY ALL @ INPUT N @ SETTIME 0 @ INTEGER M(N) @ MAT M=CON @ P=2 @ WHILE P<=N @ S=P*P
    2  FOR K=S TO N STEP S @ M(K)=0 @ NEXT K @ P=FPRIM(P+1) @ END WHILE @ U=1 @ V=10 @ Q=1 @ S=7
    3  FOR T=11 TO N @ S=S+(M(T)#0) @ X=SQR(6*T/S) @ Y=ABS(PI-X) @ IF Y>=V THEN 6
    4  W=T/Q @ Z=V/Y @ H=(Z-1)/(W-1) @ IF H<3 THEN 6
    5  Q=T @ V=Y @ DISP T;X;V;FNR(Z,3);FNR(W,3);FNR(H,3)
    6  NEXT T @ DISP "OK";TIME @ DEF FNR(N,D)=IROUND(10^D*N)/10^D
    >RUN ->  ? 30000

         N       Pi Approx.       |Error|      Epre/E  N/Npre    Merit
       -------- -------------------------------------------------------
       (smallest values, which result in irrelevant approximations, not listed)   
        1135   3.14158641730   0.00000623629  325.248  40.536    8.201
        1186   3.14159601478   0.00000336119    1.855   1.045   19.036
        2094   3.14159185312   0.00000080047    4.199   1.766    4.178 
        5374   3.14159277156   0.00000011797    6.785   2.566    3.693
        7241   3.14159270516   0.00000005157    2.288   1.347    3.706


    OK timing (go71b: 25.4", Emu71/Win: 3.33", physical HP-71B: 54' 11")
Again, using more RAM we can search up to, say, N = 100,000 and we get two more:
       67490   3.14159265758   0.00000000399    2.023   1.028   35.942
       89440   3.14159265330   0.00000000029   13.759   1.325   39.229
      
Why do we exclude, e.g. 5,147 from the list ? Because 5,147 's error is 2.010x smaller than its predecessor's, 2,094, but at the cost of 2.458x larger N, so it's no real improvement over 2,094 and doesn't really pay off having to waste so much more time for such a meager improvement. Thus, its merit factor is too low, doesn't meet the suitability criterium and is excluded from the list; these are the relevant numbers:
    2094  3.14159185312  0.00000080047  4.199  1.766  4.178
    5147  3.14159305181  0.00000039822  2.010  2.458  0.693
and the same goes for all other omitted values. Only the ones which result in a marked improvement as per the criterium are deemed worthy to be listed.

Finally, we can do the search for N much greater than 100,000, up to 1,500,000 and beyond by using boolean operators and variables and I've written such a version of this program implementing them, but that's left to the reader as an exercise ... Smile

Thanks to all of you who participated, much appreciated and I hope you enjoyed it all.

V.

  
All My Articles & other Materials here:  Valentin Albillo's HP Collection
 
Visit this user's website Find all posts by this user
Quote this message in a reply
04-08-2023, 10:41 PM
Post: #33
RE: [VA] SRC #013 - Pi Day 2023 Special
  
Hi again !

I thought I was done with this thread for good but while converting it to a PDF document for uploading it to my HP site's HP Calculator Challenges section, I realized there was a bit of unfinished business which had eluded me so far.

Namely, I said in Post #15 the following, I quote:
    "As for the Möbius function {μ(N) henceforth}, it's a very important number-theoretical function which is easily computed in various ways, like this simple HP-71B user-defined function FNM (which should be placed at the end of any program using it):  {requires the JPC ROM}

       1  DEF FNM(N) @ IF N=1 THEN FNM=1 @ END ELSE F=1
       2  D=PRIM(N) @ IF NOT D THEN FNM=(-1)^F ELSE IF MOD(N,D*D) THEN F=F+1 @ N=N/D @ GOTO 2

          >FOR N=96673 TO 96686 @ FNM(N); @ NEXT N @@  ►  1 1 0 0 0 0 0 0 1 1 1 0 -1 -1
    [...]

    [ ... Good and Churchhouse said ...] 'Thus the Möbius sequence contains arbitrarily long runs of zeros, but these long runs presumably occur extremely rarely.'

    I showed above such a run of zeros from N = 96675 to 96680, six consecutive zeros in all. It would make for an interesting challenge to locate longer runs."

So, here I address this remaining sub-challenge by creating an HP-71B program to find the first values of N which are the start of a run of 1, 2, 3, ..., consecutive zeros of μ(N). Specifically, this 169-byte 4-liner finds them up to a given maximum N very quickly:   {requires the JPC ROM}
    1  DESTROY ALL @ INPUT N @ SETTIME 0 @ INTEGER M(N) @ P=2 @ WHILE P<=N @ S=P*P
    2  FOR K=S TO N STEP S @ M(K)=1 @ NEXT K @ P=FPRIM(P+1) @ END WHILE @ Z=0 @ U=0
    3  FOR K=1 TO N @ IF M(K) THEN Z=Z+1 ELSE IF Z>U THEN U=Z @ DISP U;K-Z @ Z=0 ELSE Z=0
    4  NEXT K @ DISP "OK";TIME


    ●  Lines 1-2 perform the initialization and identify the zeros of μ(N), while line 3 contains all the logic to locate the start of the first run of 1, 2, 3, ..., consecutive zeros.

Let's find the first initial N for L = 1, 2, 3, ..., consecutive zeros for N up to 25,000 (requires ~ 75 Kb of RAM):
    >RUN ->  ? 25000

    L    Initial N
    ----------------
    1          4
    2          8
    3         48
    4        242
    5        844
    6      22020


    OK timing (go71b: 10.60", Emu71/Win: 1.39", physical HP-71B: 22' 37")
This means that e.g. for the case L = 5 we have that μ(844), μ(845), μ(846), μ(847) and μ(848) are all zero, thus the first run of 5 consecutive zeros of μ(N) starts at N = 844. This also means that all five numbers in the run are divisible by some square, namely:
    844 = 22 x 211;  845 = 5 x 132;  846 = 2 x 32 x 47;  847 = 7 x 112;  848 = 24 x 53
Adding up more RAM and/or using boolean arrays and operations we'd be able to extend the search up to N = 1,200,000, say, which would allow us to find two additional runs of lengths 7 and 8, respectively:
    7     217070
    8    1092747
Longer runs would require much more complex programming and longer execution times.

V.

  
All My Articles & other Materials here:  Valentin Albillo's HP Collection
 
Visit this user's website Find all posts by this user
Quote this message in a reply
Post Reply 




User(s) browsing this thread: 3 Guest(s)