
EdS2 · 02-02-2023, 08:56 AM

(02-01-2023 04:25 PM)brouhaha Wrote:
(02-01-2023 09:08 AM)EdS2 Wrote: I'm really surprised to be seeing support for (3) despite the breaking of compatibility...

I prefer 3, even though it's not entirely compatible with the 16C, because
...

Thanks Eric, that's helpful.

brouhaha · 03-02-2023, 04:52 AM

My memory expansion microcode for the 16C is now working on the DM16L, with the microcode inserted into the SwissMicros DM16_32 firmware. I had to reduce the expanded RAM size slightly compared to what I've got in Nonpareil, because the SwissMicros ARM code implements hardware registers for the 15C's second R2D2 chip, whereas in Nonpareil I only do that for the 15C, and not other Voyager models.

After clearing the memory (WSIZE=16), f MEM reports "P-0 R-0773". If I set WSIZE=4, f MEM reports "P-0 R-3094". The microcode is patched to handle that many registers, and display four digits for the register count. It should now allow 999 step programs; I've tested that in Nonpareil, but not yet on the DM16L.
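The two f MEM figures agree with each other if the register count is simply the free memory in bits divided by the word size, rounded down. A minimal Python sketch of that arithmetic, assuming the microcode does a plain floor division (that is not stated above) and using a made-up function name:

```python
def registers_available(free_bits: int, wsize: int) -> int:
    """Whole registers of `wsize` bits that fit in `free_bits` (assumed: floor division)."""
    if not 1 <= wsize <= 64:
        raise ValueError("the 16C word size ranges from 1 to 64 bits")
    return free_bits // wsize


# 3094 four-bit registers reported at WSIZE=4 imply this many free bits.
free_bits = 3094 * 4          # 12376 bits, i.e. 1547 bytes

print(registers_available(free_bits, 4))    # 3094
print(registers_available(free_bits, 16))   # 773
```

The half register left over at WSIZE=16 (8 bits) is why 773 is not exactly a quarter of 3094.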

I was pleased to see that the SwissMicros serial console "s" command to dump memory actually works, so apparently that dumps all non-zero hardware registers from 0x00 through 0xff. Presumably the "l" command will support those as well, but I haven't yet tried it.

I've been thinking about expanding the 15C and 16C memory even further, but that would probably NOT work with the SwissMicros firmware, because I doubt that SwissMicros supports RAM above 0xff. (To be fair, I haven't actually tried.) I'd have to write my own ARM code, which I've been considering doing anyhow, but if I did that I probably wouldn't support multiple fonts, the RTC, and other nifty features that SwissMicros provides.

Garry Lancaster · 03-03-2023, 06:44 PM

(03-02-2023 04:52 AM)brouhaha Wrote: My memory expansion microcode for the 16C is now working on the DM16L, with the microcode inserted into the SwissMicros DM16_32 firmware.

Nice progress. Was it easy to find the ROM image within the DM firmware?

Quote:I was pleased to see that the SwissMicros serial console "s" command to dump memory actually works, so apparently that dumps all non-zero hardware registers from 0x00 through 0xff. Presumably the "l" command will support those as well, but I haven't yet tried it.

That's good. The dump/restore facility is very useful, and will be more important with a larger memory space for programs.

Quote:I've been thinking about expanding the 15C and 16C memory even further, but that would probably NOT work with the SwissMicros firmware, because I doubt that SwissMicros supports RAM above 0xff. (To be fair, I haven't actually tried.) I'd have to write my own ARM code, which I've been considering doing anyhow, but if I did that I probably wouldn't support multiple fonts, the RTC, and other nifty features that SwissMicros provides.

Seems to make more sense to leverage the facilities the SwissMicros firmware provides rather than rolling your own. 1547 available bytes seems plenty for 16C usage, although maybe 15C users are more demanding.

brouhaha · 03-03-2023, 10:33 PM

(03-03-2023 06:44 PM)Garry Lancaster Wrote: Was it easy to find the ROM image within the DM firmware?

In terms of difficulty level, I'd say it was moderately hard, but having the right tools helps tremendously.

In terms of time, with the right tools I was able to figure out what needed to be done in about 45 minutes. Actually writing custom Python scripts to deal with it, and in the process validate my analysis, took about two hours. Then, since I had to insert a bigger microcode ROM image into the DM16 firmware than it was designed to accommodate, I spent roughly four hours writing more scripts, trying several approaches, and debugging. About another half hour was spent testing the registers and finding out that one low-numbered register wasn't working correctly, but a different register depending on word size. Figuring out that the problem was caused by using memory that would be the second R2D2 in the 15C took perhaps 15 minutes, and fixing it took another 15 minutes (mostly updating comments in my code; the actual code change was just one line).

I'm not sure why they've made it difficult to access the microcode image, but it's clear that it was done deliberately. I don't think I should make the details public, because that might lead to it being made even more difficult in the future, although that's already a possibility now that I've revealed that I've figured it out.

Quote:
Quote:I've been thinking about expanding the 15C and 16C memory even further, but that would probably NOT work with the SwissMicros firmware, because I doubt that SwissMicros supports RAM above 0xff. (To be fair, I haven't actually tried.) I'd have to write my own ARM code, which I've been considering doing anyhow, but if I did that I probably wouldn't support multiple fonts, the RTC, and other nifty features that SwissMicros provides.

Seems to make more sense to leverage the facilities the SwissMicros firmware provides rather than rolling your own. 1547 available bytes seems plenty for 16C usage, although maybe 15C users are more demanding.

If you want more labels on the 16C, there's no choice but to increase the memory. There are only 13 unused codes for program steps, whereas I need 720 additional codes for LBL, GTO, and GSB with extended labels. The size of a program step has to get bigger, and for various reasons it's only practical to extend it from 8 bits to 16 bits. That means that without expanding the memory, you can have either:

1) 999 program steps and 548 bytes of registers, but no new labels

2) a smaller number of program steps (perhaps up to 884 but with NO registers), but with more labels.

With memory expansion (beyond what the SwissMicros ARM code is known to support), you could have 999 steps with extra labels, and 1792 bytes of registers, simultaneously. Those sizes would always be available, with no moveable partition boundary.
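The figures in the two options can be cross-checked with a little arithmetic. The sketch below assumes 7-byte (56-bit) registers, as on the Nut CPU, and guesses that the 884 figure comes from fitting four wider steps into each register. Neither assumption is stated in the post; they are used only because they reproduce the numbers.

```python
REGISTER_BYTES = 7                 # assumed: 56-bit registers
TOTAL_BYTES = 1547                 # available bytes quoted earlier in the thread

# Option 1: one byte per step, the remainder left for data registers.
option1_register_bytes = TOTAL_BYTES - 999             # 548

# Option 2: wider steps, the whole memory given to the program.
registers = TOTAL_BYTES // REGISTER_BYTES              # 221
option2_steps = registers * 4                          # assumed: four steps per register

# The 1792-byte figure is exactly 256 seven-byte registers.
expanded_register_bytes = 256 * REGISTER_BYTES

print(option1_register_bytes, option2_steps, expanded_register_bytes)   # 548 884 1792
```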

Garry Lancaster · 03-03-2023, 11:55 PM

(03-03-2023 10:33 PM)brouhaha Wrote: I'm not sure why they've made it difficult to access the microcode image, but it's clear that it was done deliberately. I don't think I should make the details public, because that might lead to it being made even more difficult in the future, although that's already a possibility now that I've revealed that I've figured it out.

Interesting, and somewhat odd considering they seem to welcome alternative firmwares running on their DMCP platform. Still, good that you were able to figure it out.

Quote:If you want more labels on the 16C, there's no choice but to increase the memory. There are only 13 unused codes for program steps, whereas I need 720 additional codes for LBL, GTO, and GSB with extended labels. The size of a program step has to get bigger, and for various reasons it's only practical to extend it from 8 bits to 16 bits.

I assume this means you plan to make all steps 2 bytes in size? Is it feasible to keep the existing 1-byte codes, taking 3 unused ones for extended LBL/GTO/GSB, with just those requiring an additional byte for the label id?

This would presumably require reimplementing GTO . nnn to always search from the start of the program, if it's currently done by straightforward calculation. And BST to effectively do a GTO . (current step - 1). But as these are interactive, performance isn't too important.

Maybe 15C already does something similar, as I understand it has a mixture of 1- and 2-byte step encodings.

Quote:2) a smaller number of program steps (perhaps up to 884 but with NO registers), but with more labels.

With memory expansion (beyond what the SwissMicros ARM code is known to support), you could have 999 steps with extra labels, and 1792 bytes of registers, simultaneously. Those sizes would always be available, with no moveable partition boundary.

884 steps with no registers is still a massive improvement on what we currently have. But more would of course be welcome!

Paul Dale · 03-04-2023, 04:43 AM

I would 100% support these expansions.

Two-bytes vs one-byte opcodes? I'm not concerned. Move most of the one-byte codes into the 0-127 space. Put the rest in the 128-255 space. That's just a suggestion -- I honestly don't care so long as more capability is uncovered.

Alternatively, all op-codes are sixteen bits wide. Problem solved.

With modern ICs memory is simply not an issue.

brouhaha · 03-05-2023, 01:37 AM

(03-03-2023 11:55 PM)Garry Lancaster Wrote: I assume this means you plan to make all steps 2 bytes in size? Is it feasible to keep the existing 1-byte codes, taking 3 unused ones for extended LBL/GTO/GSB, with just those requiring an additional byte for the label id?

This would presumably require reimplementing GTO . nnn to always search from the start of the program, if it's currently done by straightforward calculation. And BST to effectively do a GTO . (current step - 1). But as these are interactive, performance isn't too important.

Maybe 15C already does something similar, as I understand it has a mixture of 1- and 2-byte step encodings.

The 15C microcode goes to absolutely HUGE lengths to make variable-length instructions work. (And there are bugs in it on both the 15C and 41C.) While it's obviously possible to hack the 16C to do that, it would be an enormous task.

Also, going to two-byte instructions "in-place" where the one-byte instructions are stored now is also too much work. Too much of the microcode is dependent on the way the instructions are stored.

The "easy" approaches are to keep one byte where it is, and store the second byte in a parallel structure. There are two possibilities for that parallel structure:

1) keep it somewhere in the main RAM (0x10..0xff). For instance, it could be directly under the current program. This is a bit of a pain, but easier than the variable-length or two-byte-in-place options. Every time the program grows or shrinks by what would normally be one register, the ENTIRE second-byte array would have to also shift by an entire register, in addition to its own size change. The only benefit of this arrangement is that it will work even in a host environment that only implements RAM up to 0xff.

2) store the second bytes somewhere above 0xff - this is by far the easiest approach, and what I have working in an experimental build of Nonpareil. It results in 999 program steps (two-byte), and 1792 bytes of registers, with no curtain. The drawback is that it doesn't appear to work with the SwissMicros ARM code.
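The parallel-structure idea can be illustrated in a few lines of Python. This is only a sketch of the data layout, not of the microcode: the class and method names are invented, the opcode values are placeholders, and real storage would be packed into calculator registers rather than Python byte arrays. It assumes a second byte of zero marks an ordinary, unextended instruction.

```python
class ProgramStore:
    """Program steps kept as two same-length byte arrays (hypothetical layout)."""

    def __init__(self) -> None:
        self.first = bytearray()    # existing one-byte opcodes, left where they are today
        self.second = bytearray()   # parallel array; 0 means an unextended instruction

    def insert(self, index: int, opcode: int, extra: int = 0) -> None:
        # Both arrays are edited together so step i always pairs first[i] with second[i].
        self.first.insert(index, opcode)
        self.second.insert(index, extra)

    def delete(self, index: int) -> None:
        del self.first[index]
        del self.second[index]

    def is_extended(self, index: int) -> bool:
        return self.second[index] != 0

    def step(self, index: int) -> tuple[int, int]:
        return self.first[index], self.second[index]


prog = ProgramStore()
prog.insert(0, 0x22)            # placeholder value, not a real 16C opcode
prog.insert(1, 0x31, 0x05)      # placeholder extended step: second byte is non-zero
print(prog.is_extended(0), prog.is_extended(1))   # False True
```

Keeping the two arrays in lockstep on every insert and delete is the bookkeeping that option 1 makes painful and option 2 makes cheap.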

When I started hacking the 16C microcode, I intended to run it on SwissMicros hardware, but I didn't really expect to be able to use their ARM code, for both technical and licensing reasons. I could work around the licensing, by not supplying their code, only a program to patch it.

brouhaha · 03-05-2023, 01:58 AM

(03-04-2023 04:43 AM)Paul Dale Wrote: Two-bytes vs one-byte opcodes? I'm not concerned. Move most of the one-byte codes into the 0-127 space. Put the rest in the 128-255 space.

I don't think I fully understand.

On the 15C, they use variable length instructions. They needed to be able to reliably distinguish any given byte in the program as being:
1) a single-byte instruction
2) the first byte of a two-byte instruction
3) the second byte of a two-byte instruction

They accomplished that by not allowing any single-byte instruction, or any second byte, to have 0xf in the least significant digit, and requiring that all first bytes have 0xf in the least significant digit.

By making the program encoding unambiguous (never requiring looking back more than one byte), they make BST and label searches very fast.
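To make the "never look back more than one byte" property concrete, here is a small Python sketch of that rule. It is an illustration only: the function names are invented, the bytes are placeholders rather than real 15C opcodes, and it treats the program as a simple ascending byte string, which ignores how the calculator actually lays program memory out.

```python
def is_first_byte(b: int) -> bool:
    """True if `b` opens a two-byte instruction (least significant digit is 0xF)."""
    return (b & 0x0F) == 0x0F


def next_step(program: bytes, pos: int) -> int:
    """Offset of the instruction after the one starting at `pos`."""
    return pos + (2 if is_first_byte(program[pos]) else 1)


def previous_step(program: bytes, pos: int) -> int:
    """Offset of the instruction that ends just before `pos` (an instruction boundary)."""
    if pos <= 0:
        raise ValueError("already at the first step")
    # Only a first byte may end in 0xF, so one look at pos - 2 settles it.
    if pos >= 2 and is_first_byte(program[pos - 2]):
        return pos - 2
    return pos - 1


# Placeholder bytes: a one-byte step, a two-byte step, then a one-byte step.
program = bytes([0x12, 0x4F, 0x03, 0x25])
print(previous_step(program, 4))   # 3
print(previous_step(program, 3))   # 1
print(previous_step(program, 1))   # 0
```

Because the decision needs no scan from the start of the program, a BST built on this costs the same regardless of program length.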

The HP-41C has much more complicated rules for instruction length, and arbitrary bytes in memory are ambiguous, so BST and label search are expensive. They make up for the label search cost by inserting the target of a local GTO and XEQ instruction into that instruction the first time the search for the target is done, and then they have to do a "decompile" pass every time the program gets edited. (The 41 also runs the Nut CPU at nearly twice the speed of a Voyager.)

Quote:Alternatively, all op-codes are sixteen bits wide. Problem solved.

The easiest solution, and the one I want to use. The "extra byte" for all current instructions will be zero. If it's non-zero, then it's a new instruction.

The trick is where to store the second byte.

Quote:With modern ICs memory is simply not an issue.

Sure, but I'm not trying to write a 16C-equivalent calculator from scratch. (An argument could easily be made that I should.) I'm trying to hack up HP's 16C microcode. If I stay within certain bounds, such as only using registers from 0x10 through 0xff, then it will run on existing simulation code on existing ARM-based calculators from both HP/Moravia/Royal and SwissMicros. If I write my own simulator (which as everyone here knows, I've done before), then I could run it on pretty much any hardware, but I'd have to write specific code for each hardware platform to deal with its keyboard, display, etc. Even that's not a huge deal, except that on SwissMicros, I'd be giving up the fonts, real-time-clock, and other features they have, unless I went to the trouble of writing equivalents to all that myself. I did that on the series of prototypes that Richard Ottosen and I built; we had two prototypes that had functionality similar to the DM41X and the DM42. But in the limited time I can devote to this, I enjoy hacking the HP microcode more than I enjoy dealing with new ARM hardware.

As with everything, it's a tradeoff, and there's no correct answer.

Paul Dale · 03-05-2023, 06:08 AM

My thought was opcodes 0-127 are single byte. Two byte opcodes consist of two 128-255 bytes. That gives 128 short opcodes and 16384 long opcodes which should be more than enough.
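As a rough illustration of this scheme, the Python sketch below decodes a byte stream under exactly those rules and checks the opcode counts. The function name and the sample bytes are invented for the example; nothing here corresponds to actual 16C codes.

```python
def decode(program: bytes) -> list[tuple[int, ...]]:
    """Split a byte stream into steps: 0-127 is a short opcode, two 128-255 bytes a long one."""
    steps: list[tuple[int, ...]] = []
    i = 0
    while i < len(program):
        b = program[i]
        if b < 128:
            steps.append((b,))
            i += 1
        else:
            if i + 1 >= len(program) or program[i + 1] < 128:
                raise ValueError(f"byte {i}: long opcode needs a second byte in 128-255")
            steps.append((b, program[i + 1]))
            i += 2
    return steps


SHORT_OPCODES = 128            # values 0-127
LONG_OPCODES = 128 * 128       # two bytes, each with 128 possible values

print(SHORT_OPCODES, LONG_OPCODES)                  # 128 16384
print(decode(bytes([0x05, 0x80, 0xFF, 0x7F])))      # [(5,), (128, 255), (127,)]
```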

Garry Lancaster · 03-05-2023, 07:35 PM

(03-05-2023 01:37 AM)brouhaha Wrote: The 15C microcode goes to absolutely HUGE lengths to make variable-length instructions work. (And there are bugs in it on both the 15C and 41C.) While it's obviously possible to hack the 16C to do that, it would be an enormous task.

Also, going to two-byte instructions "in-place" where the one-byte instructions are stored now is also too much work. Too much of the microcode is dependent on the way the instructions are stored.

The "easy" approaches are to keep one byte where it is, and store the second byte in a parallel structure.
....

Very interesting indeed - thanks for all the insights. A parallel structure for the second byte sounds like quite a clever solution.

brouhaha · 03-12-2023, 08:32 AM

I think I've now got a handle on adding float mode operations to the normally integer-only instructions as Garry suggested. I need to do more testing to make sure I haven't broken the integer operations.

I found it interesting that the 16C microcode does such an effective job of making those instructions act as no-ops in float mode that they don't even terminate digit entry or affect the stack lift status.

Garry Lancaster · 03-12-2023, 09:36 AM

(03-12-2023 08:32 AM)brouhaha Wrote: I think I've now got a handle on adding float mode operations to the normally integer-only instructions as Garry suggested. I need to do more testing to make sure I haven't broken the integer operations.

I found it interesting that the 16C microcode does such an effective job of making those instructions act as no-ops in float mode that they don't even terminate digit entry or affect the stack lift status.

Excellent news :)

brouhaha · (This post was last modified: 04-05-2023 05:38 PM by brouhaha.)

(03-12-2023 08:32 AM)brouhaha Wrote: I think I've now got a handle on adding float mode operations to the normally integer-only instructions as Garry suggested. I need to do more testing to make sure I haven't broken the integer operations.

Now I'm back to 16C microcode hacking after the side quest on the LCD segment to hex program. The purpose of that was to save me perhaps 10 minutes of time writing the code that does this (shown by my Nonpareil simulator):

Or shown by running the same hacked firmware on a SwissMicros DM16L:

What you see there is that I've patched the microcode that decides whether a function can be executed in FLOAT mode. The calculator is in FLOAT mode, and I've pressed "f SL", which is normally completely ignored in FLOAT. My patch got control, and displayed "FL OP" and a three digit hexadecimal code identifying the specific function, in this case "15A". (This isn't the actual program opcode for SL; when executing from the keyboard, the opcode is not available by the time my patch executes.)

So instead of spending ten minutes figuring out which bits to set in the LCD control registers to display "FL OP", I spent two weeks of spare time writing a program to do it for me! Here's what that program shows, including the 15C user code instructions that could be used to generate the same display on a 15C:

Unlike the "FL OP", the digits "15A" are decoded using the 16C's internal hexadecimal-to-seven-segment routine.

An interesting thing to note is that the standard 16C microcode never displays "L", and yet the SwissMicros ARM firmware recognized them and converted them into its internal character set (maybe ASCII?), and then displays them correctly in its bitmap font. I suspect that this happens because SwissMicros made synthetic programming on the 15C work, where such displays can be generated.

brouhaha · (This post was last modified: 04-06-2023 05:00 AM by brouhaha.)

The first transcendental functions I've tried to add to the 16C are ln, e^x, log (base 10), and 10^x. I have ln actually working, as long as digit entry has been terminated before invoking it. e^x kind of works, but usually gives results 10x what it should be (floating point exponent high by 1), and other times crashes or returns gibberish.

log and 10^x aren't yet working. They're giving the ln and e^x results instead.

Anyhow, I was pretty amazed to be able to compute ln(8) = 2.079441542 on a (hacked) 16C. One small step for a man, one infinitesimally small leap for mankind.

One of the possible issues is that the code uses status register bits 4, 5, 6, 7, and 8, and this may interfere with the 16C's normal use of those status bits. I need to find somewhere to save the status bits before executing the new functions, and restore them afterward.

Garry Lancaster · 04-06-2023, 07:18 AM

Sounds like great progress. Are you adapting the code from the 15C, or 11C?

Do you have to do any further work to get these new functions accessible in a program, or does it all just fall out in the wash if float mode is set and the appropriate key code is encountered? I suppose what I'm probably asking here is whether the float function has the same instruction code as the integer function in the same location.

brouhaha · 04-06-2023, 07:45 AM

(04-06-2023 07:18 AM)Garry Lancaster Wrote: Sounds like great progress. Are you adapting the code from the 15C, or 11C?

I'm using code from the 41C and 10C. The transcendental code is pretty similar in all models in the late Woodstock/Topcat (67/97, 19C/29C), Spice, and Voyager series.

Quote:Do you have to do any further work to get these new functions accessible in a program, or does it all just fall out in the wash if float mode is set and the appropriate key code is encountered? I suppose what I'm probably asking here is whether the float function has the same instruction code as the integer function in the same location.

Yes, the same instruction code. Right now I'm using SL (f A) as LN in float mode (though that will probably change), so in a program there's no way to tell them apart unless you know whether it will be in float or integer mode when executed.

brouhaha · 04-07-2023, 10:23 AM

(04-06-2023 04:53 AM)brouhaha Wrote: One of the possible issues is that the code uses status register bits 4, 5, 6, 7, and 8, and this may interfere with the 16C's normal use of those status bits. I need to find somewhere to save the status bits before executing the new functions, and restore them afterward.

I switched the status bit usage from what was in the 10C to that of the 41C, in particular to not use status bit 8, so that I can save and restore status bits 0-7. That does seem to have cleared up quite a few issues. e^x and 10^x are still returning 10x the proper value. I probably need to compare traces of e^x operation on the hacked 16C microcode vs the 41C.
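The save-and-restore pattern described here looks roughly like the following in Python. It models the status register as a plain integer and the borrowed routine as a function; both, along with all names, are stand-ins for illustration. The hard part in the real microcode, finding a place to keep the saved byte, is not modelled.

```python
LOW_STATUS_MASK = 0x00FF   # status bits 0-7


def call_preserving_low_status(status: int, routine) -> int:
    """Run `routine`, then put status bits 0-7 back the way they were."""
    saved = status & LOW_STATUS_MASK
    status = routine(status)
    return (status & ~LOW_STATUS_MASK) | saved


def fake_transcendental(status: int) -> int:
    """Stand-in for the borrowed routine: scribbles on bits 4-7 only."""
    return status ^ 0b1111_0000


before = 0b0000_0010_0101_0011
after = call_preserving_low_status(before, fake_transcendental)
print(after == before)   # True
```

A routine that also touched bit 8 would leak through this mask, which is the reason for avoiding that bit in the first place.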

I started adding the 41C y^x code also, but I haven't got it hooked up to a 16C operation yet. I need to figure out how best to handle two-operand instructions, for y^x, P->R, and R->P. I haven't even started merging and debugging the trig code yet, so P->R and R->P are still some ways off, but y^x should be easy to get working once e^x is working.

brouhaha · 04-08-2023, 06:53 AM

I've fixed a few dumb mistakes that led to wrong values from the exponential functions. As far as I've seen so far, the hacked 16C microcode now gets correct results for logarithmic functions, and exponential functions when the operand doesn't cause overflow. When the exponentials overflow, instead of getting 9.999999999e99, they get an NNN with an exponent greater than 99, which is displayed improperly, e.g. 1e100 displays as 1.0e-00. I need to add overflow detection; I'm going to look at how the 41C and 11C do that. On the 16C, for consistency an exponential overflow should set flag 5 (G).

The logs and exponentials are saving x in LASTx, except if number entry has not been terminated prior to the function being invoked. In other words, the sequence "4 x<>y x<>y ln" puts 4.0 in LASTx, but "4 ln" puts an NNN from the non-terminated digit entry into LASTx. I need to study how other 16C floating point functions (arithmetic, 1/x, sqrt) deal with LASTx. It looks like the default LASTx handling is only suited for integer functions.

I'm gradually learning a lot more details about 16C microcode internals.

brouhaha · 04-09-2023, 12:15 AM

(04-08-2023 06:53 AM)brouhaha Wrote: The logs and exponentials are saving x in LASTx, except if number entry has not been terminated prior to the function being invoked. In other words, the sequence "4 x<>y x<>y ln" puts 4.0 in LASTx, but "4 ln" puts an NNN from the non-terminated digit entry into LASTx. I need to study how other 16C floating point functions (arithmetic, 1/x, sqrt) deal with LASTx. It looks like the default LASTx handling is only suited for integer functions.

The LASTx problem was another thinko on my part. I understood how the default LASTx handling works, but didn't notice that it occurs AFTER the point that I patch to implement the new functions. I just had to copy about eight instructions from the normal path into my alternate path.

I still have to add overflow handling, which will probably not be as simple.

brouhaha · 04-09-2023, 06:54 AM

Overflow handling wasn't terribly difficult. I've resolved all of the known bugs in the hacked 16C firmware. I'd be surprised if there weren't more bugs hiding in there. I wish I had a copy of HP's regression test suite.

HP's e^x routine is deliberately written to support a range up to 10^200. Beyond that it saturates at 10^200. While 10^100 is the overflow point for final user results, I think the 10^200 limit in the actual routine was to allow for extended range in intermediate results in other microcode functions that use e^x as a subroutine.
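The two-level limit can be pictured with a short Python sketch. It uses ordinary binary floats rather than the calculator's 10-digit BCD arithmetic, and the function names and the split into "internal" and "user" stages are my own framing of the description above, not HP's code structure.

```python
import math

INTERNAL_LIMIT = 1e200          # assumed saturation point of the e^x routine itself
USER_OVERFLOW = 1e100           # final results at or above this overflow
USER_MAX = 9.999999999e99       # value shown to the user on overflow


def exp_internal(x: float) -> float:
    """e^x with the extended intermediate range, saturating at 10^200."""
    if x >= math.log(INTERNAL_LIMIT):
        return INTERNAL_LIMIT
    return math.exp(x)


def finish_for_user(value: float) -> tuple[float, bool]:
    """Clamp a final result to the user range; the bool models the overflow flag."""
    if value >= USER_OVERFLOW:
        return USER_MAX, True
    return value, False


small = finish_for_user(exp_internal(1.0))       # within range, no overflow
large = finish_for_user(exp_internal(300.0))     # fine internally, overflows for the user
print(small[1], large[1])                        # False True
```

A caller that uses e^x as a subroutine would work with `exp_internal` directly and only clamp once, at the end of its own calculation.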