Post Reply 
Speed of HP-49G Routine for Tektronix Vector Graphics Terminal
08-06-2020, 02:13 PM (This post was last modified: 08-06-2020 03:04 PM by 3298.)
Post: #24
RE: Speed of HP-49G Routine for Tektronix Vector Graphics Terminal
(08-06-2020 11:21 AM)Werner Wrote:  The gc does not update the Saturn RSTK. So if one happens during the PUSHhxs call, and the code has moved, you will not return to 'GOSUB label2' at all. So you will never end up with a difference. You will just crash.
I wasn't quite sure whether it would adjust RSTK levels. But I specifically dropped the small HXS so the new one would overwrite part of the code (to trigger a visible crash). If I dropped the large one (>200KB in my test on a mostly empty x49gp instance), the GC might well copy the code object to a place so far away that the old copy (now in unallocated space) may survive without getting overwritten, permitting it to finish its job without getting disturbed.
Dropping the small one should free up just enough space for the new HXS, as they are the same size.

I retested it just now, and while I somehow cannot trigger panic mode anymore (I wonder what changed? This is on the same x49gp instance as before), it now works all the way down to BINT38 (while returning FALSE). BINT37 causes "Insufficient Memory". Just for you I also changed the code to drop the big HXS instead (ROTDROP instead of SWAPDROP) - no change: fails for BINT37, works with return value FALSE for BINT38. No crash whatsoever.
Then I changed it so another object is put into TEMPOP behind the code which remains there while it's running. The intention is to have something in addition to the code-pushed HXS that would overwrite the end of the code in its pre-move position. I made it larger than the dropped small HXS (32 instead of 16 nibbles of data) so the GC wouldn't try to be smart and swap it into the place of the dropped one. Of course, all that means the number has to be adjusted up. Finally, I stumbled into a warmstart at 150 - but weirdly, it just works at BINT131.

I don't know what's going on anymore. Garbage collector and/or memory allocator, are you drunk?

Oh well. Here's a version (of the Saturn ASM program that's actually for this topic) which sidesteps the issue by running pretty much the entire code from the 80100 scratch area. No GC interference there. I could've put only the part starting at the PUSHhxs call into that area (or even only the part after it if I load 80100 into RSTK manually and call PUSHhxs with a GOVLNG instead), but since the ARM example in the MASD docs copies 28 ARM instructions weighing 8 nibbles each into that place, it's probably large enough for 43 Saturn instructions of varying length if I counted correctly (pretty much all of them shorter, except for the LA/LC pair loading the AND/OR masks; in many cases significantly shorter).

Code:
::
  CK2&Dispatch
  REALREAL
  ::
    COERCE2
    CODE
      GOSUB end
*begin
      GOSBVL POP2# ; X in C.A, Y in A.A
      R0=C.A ; SAVPTR uses C.A, so it has to be backed up
      GOSBVL SAVPTR
      C=R0.A
      B=C.B ; B = X8.X7.X6.X5.X4.X3.X2.X1
      BSL.X ; B = X8.X7.X6.X5.X4.X3.X2.X1.0.0.0.0
      BSL.A ; B = X8.X7.X6.X5.X4.X3.X2.X1.0.0.0.0.0.0.0.0
      BSL.A ; B = X8.X7.X6.X5.X4.X3.X2.X1.0.0.0.0.0.0.0.0.0.0.0.0
      B=C.X ; B = X8.X7.X6.X5.X4.X3.X2.X1.X12.X11.X10.X9.X8.X7.X6.X5.X4.X3.X2.X1
      BSRB.X ; B = X8.X7.X6.X5.X4.X3.X2.X1.0.X12.X11.X10.X9.X8.X7.X6.X5.X4.X3.X2
      P=6
      BSL.WP ; B = X8.X7.X6.X5.X4.X3.X2.X1.0.X12.X11.X10.X9.X8.X7.X6.X5.X4.X3.X2.0.0.0.0
      B=A.B ; B = X8.X7.X6.X5.X4.X3.X2.X1.0.X12.X11.X10.X9.X8.X7.X6.Y8.Y7.Y6.Y5.Y4.Y3.Y2.Y1
      BSL.WP ; B = X8.X7.X6.X5.X4.X3.X2.X1.0.X12.X11.X10.X9.X8.X7.X6.Y8.Y7.Y6.Y5.Y4.Y3.Y2.Y1.0​​.0.0.0
      P=0
      B=A.P ; B = X8.X7.X6.X5.X4.X3.X2.X1.0.X12.X11.X10.X9.X8.X7.X6.Y8.Y7.Y6.Y5.Y4.Y3.Y2.Y1.Y​​4.Y3.Y2.Y1
      B+B.W ; B = X8.X7.X6.X5.X4.X3.X2.X1.0.X12.X11.X10.X9.X8.X7.X6.Y8.Y7.Y6.Y5.Y4.Y3.Y2.Y1.Y​​4.Y3.Y2.Y1.0
      B+B.W ; B = X8.X7.X6.X5.X4.X3.X2.X1.0.X12.X11.X10.X9.X8.X7.X6.Y8.Y7.Y6.Y5.Y4.Y3.Y2.Y1.Y​​4.Y3.Y2.Y1.0.0
      CBIT=0.2 ; clear X3 so it doesn't interfere with Y1
      CBIT=0.3 ; same for X4
      B+C.P ; B = X8.X7.X6.X5.X4.X3.X2.X1.0.X12.X11.X10.X9.X8.X7.X6.Y8.Y7.Y6.Y5.Y4.Y3.Y2.Y1.Y​​4.Y3.Y2.Y1.X2.X1
      BSL.W ; B = X8.X7.X6.X5.X4.X3.X2.X1.0.X12.X11.X10.X9.X8.X7.X6.Y8.Y7.Y6.Y5.Y4.Y3.Y2.Y1.Y​​4.Y3.Y2.Y1.X2.X1.0.0.0.0
      BSL.W ; B = X8.X7.X6.X5.X4.X3.X2.X1.0.X12.X11.X10.X9.X8.X7.X6.Y8.Y7.Y6.Y5.Y4.Y3.Y2.Y1.Y​​4.Y3.Y2.Y1.X2.X1.0.0.0.0.0.0.0.0
      ASR.X ; A.X = 0.0.0.0.Y12.Y11.Y10.Y9.Y8.Y7.Y6.Y5
      A+A.X ; A.X = 0.0.0.Y12.Y11.Y10.Y9.Y8.Y7.Y6.Y5.0
      ASR.X ; A.X = 0.0.0.0.0.0.0.Y12.Y11.Y10.Y9.Y8
      B=A.B ; B = X8.X7.X6.X5.X4.X3.X2.X1.0.X12.X11.X10.X9.X8.X7.X6.Y8.Y7.Y6.Y5.Y4.Y3.Y2.Y1.Y​​4.Y3.Y2.Y1.X2.X1.0.0.0.Y12.Y11.Y10.Y9.Y8
      LA 4020606020
      LC 1F1F1F0F1F
      P=9
      B&C.WP ; B = X8.X7.X6.X5.X4.X3.0.0.0.X12.X11.X10.X9.X8.0.0.0.Y7.Y6.Y5.Y4.Y3.0.0.0.0.Y2.Y​​1.X2.X1.0.0.0.Y12.Y11.Y10.Y9.Y8
      A!B.WP ; B = 0.1.0.X8.X7.X6.X5.X4.X3.0.1.0.X12.X11.X10.X9.X8.0.1.1.Y7.Y6.Y5.Y4.Y3.0.1.1.​​0.Y2.Y1.X2.X1.0.1.0.Y12.Y11.Y10.Y9.Y8
      GOSBVL GETPTR
      GOSBVL PUSHhxs ; the object is now laid out like this: (5) DOHXS (5) length=0000Fh (10) data
; data first byte: 0.1.0.Y12.Y11.Y10.Y9.Y8
; data second byte: 0.1.1.0.Y2.Y1.X2.X1
; data third byte: 0.1.1.Y7.Y6.Y5.Y4.Y3
; data fourth byte: 0.1.0.X12.X11.X10.X9.X8
; data fifth byte: 0.1.0.X8.X7.X6.X5.X4.X3
      C=DAT1.A
      CD1EX
      LA(5) DOCSTR
      DAT1=A.A ; overwrite the prolog so it turns from DOHXS to DOCSTR
      D1=C
      A=DAT0.A
      D0+5
      PC=(A)
*end
      C=RSTK
      D0=C
      D1=80100
      LC(5) end-begin
      GOSBVL MOVEDOWN
      GOVLNG 80100
    ENDCODE
  ;
;
I should probably point out that the Saturn conversion is mostly academical at this point. The performance improvement is marginal compared to the CPU cycle budget of the calculation our parameters come out of. Just think of what it takes to perform a "simple" addition of two real numbers... and then putting the result of that onto the stack for us to consume, complete with memory allocation and its hazards. Still, it's fun to mess around with this. Smile
Find all posts by this user
Quote this message in a reply
Post Reply 


Messages In This Thread
RE: Speed of HP-49G Routine for Tektronix Vector Graphics Terminal - 3298 - 08-06-2020 02:13 PM



User(s) browsing this thread: 1 Guest(s)