FORTH for the PC-G850(V)(S)
11-06-2022, 01:56 PM
(This post was last modified: 05-21-2023 12:41 PM by robve.)
Post: #1
Recently started a new Forth project, this time for the Sharp PC-G850(V)(S). Perhaps this is interesting to some of you.
Repository: https://github.com/Robert-van-Engelen/Forth850
It's about 8K in size plus 8K free space or more, written from scratch in Z80 assembly and Forth itself. It's an ANS/standard compliant, direct threaded code (DTC) Forth implementation. Words are case insensitive. No file IO and no floating point yet (no syscalls are published as far as I know), but I may look into floating point later, e.g. to use a Z80 32-bit float library, if it fits. (EDIT: floating point is included with the latest version.)
Why write from scratch? Firstly, well-known Z80 Forth implementations like CamelForth, eForth and Jupiter Ace aren't ANS Forth compliant and lack features. Secondly, it's more fun to write from scratch.
The Moving Forth article was helpful, but I've decided to use DE as TOS, BC as IP (instruction pointer) and to keep RP (return stack pointer) in memory. It's faster that way, since DE (TOS) and HL (scratch) are exchangeable in only 4 cycles and HL is a versatile register pair. Even faster would be to implement subroutine threaded code (STC) to execute Z80 code directly without the DTC fetch+execute, but that will require 50% more space on average for Forth programs. Nice for speed, but perhaps not so much for usability on a small 32K machine.
With a bit of time tinkering, I came up with efficient Z80 code for Forth words, more efficient in some cases than CamelForth, eForth and Jupiter Ace (the latter isn't particularly fast by the way). If someone can point to more efficient Z80 Forth DTC implementations, then please let me know.
For example, my signed 16x16->16 multiplication takes only 51 cycles per bit versus 62 cycles per bit published by Zilog. I'm using the ASZ80 macro assembler with the CODE macro defining a Forth word "*" in machine code with label "star":
Code:
CODE *,star
EDIT: even faster multiplication is possible. To calculate the high order byte we do not need to iterate over all 8 bits of the high order multiplier stored in register c, but only over the nonzero bits. We can also ignore the lower order result stored in register e. This reduces the max loop iteration cycle time to 32 and 33 per bit. Furthermore, the second loop only runs until the last bit of register c is shifted out. If register c is zero, the second loop does not execute, thereby saving hundreds of cycles. We also use jp instead of jr to improve and balance the cycle time per bit:
Code:
CODE *,star
At the heart of Forth parsing are my new CHOP and TRIM words in assembly that exploit special Z80 ops, which I found to be better suited and more efficient than CamelForth's approach to parsing. The CHOP and TRIM words:
Code:
; CHOP c-addr u1 char -- c-addr u2
My case-insensitive dictionary search FIND-WORD should speed up Forth compilation by quite a bit:
Code:
; FIND-WORD c-addr u -- c-addr 0 | xt 1 | xt -1
The Z80 is alright for Forth (I used the Z80 back in the 80s and loved it), but not as good as the 6809 and the ESR-L (PC-E500), which have two stack registers. What a luxury compared to the Z80!
- Rob
"I count on old friends to remain rational"
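The multiplication idea in the EDIT above can be modeled in a few lines of Python. This is only a sketch of the arithmetic, not the author's Z80 routine (which is not reproduced in this thread): the function name `mul16` and the loop structure are assumptions made for illustration. It shows why the high byte of the multiplier only needs an 8-bit accumulate into the high byte of the result, and why that second loop can stop as soon as the remaining multiplier bits are zero.

```python
def mul16(a: int, b: int) -> int:
    """Model of a 16x16->16 multiply; returns the low 16 bits of a*b.

    Hypothetical illustration only, not the Forth850 source.
    """
    a &= 0xFFFF
    b &= 0xFFFF
    b_lo, b_hi = b & 0xFF, b >> 8

    # First loop: classic shift-and-add over all 8 bits of the low
    # multiplier byte (most significant bit first), 16-bit accumulator.
    acc = 0
    for bit in range(7, -1, -1):
        acc = (acc << 1) & 0xFFFF
        if (b_lo >> bit) & 1:
            acc = (acc + a) & 0xFFFF

    # Second loop: the high multiplier byte can only change the high
    # byte of a 16-bit result, so an 8-bit accumulate suffices.
    # The loop ends once no set bits remain; it is skipped if b_hi == 0.
    high = acc >> 8
    addend = a & 0xFF
    while b_hi:
        if b_hi & 1:
            high = (high + addend) & 0xFF
        addend = (addend << 1) & 0xFF
        b_hi >>= 1

    return (high << 8) | (acc & 0xFF)
```

Because only the low 16 bits are kept, the same routine serves signed and unsigned operands in two's complement, e.g. `mul16(0xFFFF, 2)` gives `0xFFFE`, which is -2 as a signed cell.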
11-08-2022, 09:23 PM
Post: #2
RE: FORTH for the PC-G850(V)(S)
Currently testing Forth850 for Forth standard compliance and performance.
The NQUEENS benchmark runs in 0.865 seconds. This is measured in "fast mode" by looping NQUEENS 100 times with a stopwatch. In "slow mode", the NQUEENS benchmark takes a little longer, 0.95 seconds.
This is the same Forth NQUEENS program as benchmarked on the PC-E500. The PC-G850 runs at 8MHz, whereas the PC-E500 runs at 2.3MHz with NQUEENS taking 3.47 seconds. Since 8/2.3 = 3.48, I am happy to see that the performance of these implementations is comparable when adjusting for clock speed. The Z80 runs slightly faster, but that is attributable to the fact that the PC-E500 has a 20-bit address space, which requires 16<->20 bit conversions in Forth when dealing with addresses stored in 16-bit Forth cells.
If it matters, the "slow mode" version of Forth850 uses ~300 bytes less of the 8K Forth core, which doesn't seem like a critical space saving to me. So the "fast" version is probably the one to release once testing is completed.
Both slow and fast versions perform stack over/underflow detection and BREAK key interrupt detection. Without these, NQUEENS would run in 0.84 seconds. But that version won't be as usable to play with, since it will crash when the stack over/underflows.
Will release Forth850 soon when testing is completed, so anyone can try it out and rerun the benchmark.
- Rob
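The clock-speed adjustment in the post above is simple arithmetic, which the following Python sketch spells out using only the figures quoted in the thread (0.865 s at 8 MHz, 3.47 s at 2.3 MHz). The helper names are made up for illustration, and the nominal clock rates are taken at face value.

```python
def cycles(seconds: float, mhz: float) -> float:
    """Approximate clock cycles spent, given run time and clock rate."""
    return seconds * mhz * 1e6


def scale_time(seconds: float, from_mhz: float, to_mhz: float) -> float:
    """Run time rescaled to a different clock rate (same cycle count)."""
    return seconds * from_mhz / to_mhz


g850_cycles = cycles(0.865, 8.0)   # Forth850 on the PC-G850, fast mode
e500_cycles = cycles(3.47, 2.3)    # same benchmark on the PC-E500

# PC-E500 time as it would be at 8 MHz, for a like-for-like comparison.
e500_at_8mhz = scale_time(3.47, 2.3, 8.0)

print(f"PC-G850: {g850_cycles / 1e6:.2f} M cycles")
print(f"PC-E500: {e500_cycles / 1e6:.2f} M cycles")
print(f"PC-E500 scaled to 8 MHz: {e500_at_8mhz:.3f} s vs 0.865 s")
```

On these numbers the PC-G850 run needs somewhat fewer cycles than the PC-E500 run, which matches the remark that the Z80 version is slightly faster after adjusting for clock speed.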
11-09-2022, 07:45 PM
Post: #3
RE: FORTH for the PC-G850(V)(S)
This is incredible, thank you. It would be amazing to see a new generation of these kinds of machines. Hopefully projects like yours will drum up more interest and eventually we'll see something, maybe crowdfunded.
11-10-2022, 09:36 PM
Post: #4
RE: FORTH for the PC-G850(V)(S)
Forth850 is now available: https://github.com/Robert-van-Engelen/Forth850
It has 294 words in an 8K executable, including 2xx and Dxx words for double cells, 256 byte string buffers PAD, TIB and TMP, string operations, exceptions THROW and CATCH, VALUE, 2VALUE, FORGET, VOCABULARY, simple graphics with DRAW and VIEW, port IO with OUT and INP.
It's super fast by design and written from scratch in Z80 assembly (see the README in the repository for details). Not only does the nqueens benchmark run in only 0.865 seconds, compiling Forth source is speedy too, without noticeable delays.
A nice feature of Forth850 is that the size of the free dictionary space is user-definable. This is done by setting USERxxxx in the Monitor to the size of the Forth system. At minimum the system requires 9K, e.g. USER23FF, to accommodate the 8K executable with minimal stacks and dictionary space. The largest is USER75FF with about 21K of free dictionary space.
I've included a forth850.wav file to load the Forth850 binary via a cassette interface. I will look into serial later. I need to acquire a serial cable first.
Things that will be nice to add (in progress):
- Rob
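The USERxxxx figures above can be checked with a little arithmetic. The Python sketch below assumes that the user area starts at address 0100H (an assumption, suggested by the CALL256 and G100 commands mentioned elsewhere in this thread) and that xxxx is the last address of the area. The constant and function names are hypothetical.

```python
USER_START = 0x0100          # assumed start of the user area (CALL256 / G100)
EXECUTABLE_BYTES = 8 * 1024  # "8K executable" as stated in the post


def user_area_bytes(user_end: int) -> int:
    """Size in bytes of the area reserved by USERxxxx, end address inclusive."""
    return user_end - USER_START + 1


for setting in (0x23FF, 0x75FF):
    size = user_area_bytes(setting)
    free = size - EXECUTABLE_BYTES
    print(f"USER{setting:04X}: {size} bytes reserved, "
          f"roughly {free / 1024:.2f}K beyond the 8K executable")
```

Under these assumptions USER23FF reserves 8960 bytes (just under 9K) and USER75FF reserves 29952 bytes, leaving a little over 21K after the executable, in line with the numbers quoted in the post. Part of that remainder is also used for stacks.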
11-11-2022, 01:55 AM
Post: #5
RE: FORTH for the PC-G850(V)(S)
(11-10-2022 09:36 PM)robve Wrote: It's super fast by design and written from scratch in Z80 assembly
Indeed! 5 times faster than C is really impressive. The only more efficient language on a pocket computer is the Pascal compiler of the PB-2000C, with 1.27 seconds at 0.91 MHz, or 0.144 seconds when clock-speed adjusted to 8.0 MHz. Thank you for the test.
Calculator Benchmark
11-12-2022, 12:28 AM
(This post was last modified: 11-12-2022 06:55 PM by robve.)
Post: #6
RE: FORTH for the PC-G850(V)(S)
Well, nothing can or will always go 100% according to plan! During a last cleanup effort of the source code to post on the repository, I accidentally copied a DROP line. As a result, entering a Forth command produced <ERR-13>.
I was pulling my hair out to figure this one out. I had spent three days carefully testing everything during my free hours before I ran into this today while testing the addition of a new GETKEY word. It's fixed now.
The PC-G850 Monitor helped me a lot to check my code and set breakpoints. The PC-G850 is a great little machine!
Does anyone have suggestions for a small Z80 single precision (32-bit) floating point library? BCD or IEEE 754 or something else?
- Rob
11-12-2022, 06:26 PM
(This post was last modified: 11-12-2022 06:55 PM by robve.)
Post: #7
RE: FORTH for the PC-G850(V)(S)
With the DRAW word you can create new characters to display with text.
For example, define GREMIT (graphic emit) to display 8x6 pixel characters:
Code:
: GREMIT
To display α and β we just need to define their pixels, which is simpler in binary with %:
Code:
GREMIT alpha
So ." result " alpha ." =" beta displays result α=β
The VIEW word extracts pixel patterns. With this word you can move pixels and OR- or XOR-pixels to DRAW. The cursor's code uses VIEW and DRAW to invert the cursor on screen. KEY displays the cursor.
I've written a version of EMIT to support control characters, which the PC-G850 does not support natively.
Lots of possibilities to enhance Forth programs.
- Rob
11-13-2022, 10:15 PM
(This post was last modified: 05-21-2023 12:47 PM by robve.)
Post: #8
RE: FORTH for the PC-G850(V)(S)
Forth850 can be extended "on the fly" with your own Z80 machine code using the PC-G850(V)(S) built-in Z80 Assembler.
To clarify, here are the steps. A BEEP example follows below.
Example BEEP word. In Forth850:
Code:
NFA, BEEP
Code:
ORG 20FA
Assemble the code, which is 18 bytes long. Return to Forth850 with BASIC and CALL256 and enter:
Code:
18 ALLOT
Important: run the Assembler again to save the code in the ALLOTed space. Now BEEP works in Forth850.
Alternatively, the BEEP word can also be defined with HEX codes in Forth850 as follows, which takes more effort but gives the same result:
Code:
NFA, BEEP ( -- )
A simple example that flips the bytes of the TOS stored in register DE, which takes 5 bytes of machine code:
Code:
NFA, FLIP
Code:
ORG xxxxH
As always when writing assembly, if something is seriously wrong with it then we may crash and have to start over. In the worst case we have to install Forth850 again when the dictionary is damaged by "random POKEs". However, it is often not necessary to reset the machine: when asked for MEMORY CLEAR (Y/N), just say N (NO) and give it another try after fixing the problem.
- Rob
11-14-2022, 07:35 PM
Post: #9
RE: FORTH for the PC-G850(V)(S)
A really nice thing about the PC-G850(V)(S) is that the ROM and RAM addresses are known and included in the excellent PC-G850V(S) User Guide translated by Jack W. Hu.
So let's go forth and put our "Pokecon" to some good use by defining a new Forth850 word TEXT that reads the TEXT editor area to evaluate Forth. In this way we can write Forth code in the built-in TEXT editor and then read it back into Forth:
Code:
: TEXT
A minor caveat of using EVALUATE for each line: .( and ( cannot span multiple lines, i.e. the closing ) must be on the same line.
- Rob
11-29-2022, 07:56 PM
Post: #10
RE: FORTH for the PC-G850(V)(S)
I've written new IEEE 754 floating point routines from scratch in Z80 for Forth850. I wasn't happy with the Z80 floating point libraries I found on GitHub and elsewhere, which appear to have sizable code bases and are heavy on memory access and use. The new Z80 floating point routines I wrote operate with Z80 registers only, including the shadow registers, except for one push-pop pair to pass a value from a shadow register pair to a register pair. The shadow registers are available for use on the PC-G850 (i.e. not used by interrupts). So let's put them to work. The new math.asm Z80 library:
I may extend the library to support INF/NAN and banker's rounding to make it compliant with IEEE 754. Right now, the only observable difference is the rounding (or lack thereof). Consistency of the numerical treatment is important.
The Forth code for the Sine plot shown in the picture:
Code:
3.14159265E0 2CONSTANT PI
Code:
: FSIN ( r1 -- r2 ) \ only valid for -2pi <= r1 <= 2pi
I've updated the forth850-full version with the new features. The binary is about 10K, so there is plenty of space left on the machine. I've tested the implementation extensively and will continue to do so in the near future.
- Rob
12-07-2022, 03:07 AM
Post: #11
RE: FORTH for the PC-G850(V)(S)
(11-29-2022 07:56 PM)robve Wrote: I may extend the library to support INF/NAN and banker's rounding to make it compliant with IEEE 754. Right now, the only observable difference is the rounding (lack thereof).
I rewrote part of the Z80 IEEE 754 math core today to add two proper IEEE 754 rounding modes to choose from (at compile time):
Adding INF/NAN should actually be easy with some logic to test these special cases. I could make it an option. But I prefer to have exceptions in Forth to catch floating point overflow, division by zero etc., so signaling NaN and INF exceptions.
With proper internal round to nearest, ties to even, the new sine plot is perfectly symmetric on close inspection:
A bit surprised myself that I was able to pull this off, as I was already using all Z80 registers.
- Rob
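For readers unfamiliar with the rounding modes discussed above, here is a small Python sketch contrasting truncation with round to nearest, ties to even, when the low bits of a wider intermediate mantissa are dropped. It is a generic illustration of the two rules, not the Z80 implementation from math.asm or mathr.asm. The function names and the integer-mantissa representation are assumptions.

```python
def truncate(mantissa: int, drop: int) -> int:
    """Discard the lowest `drop` bits (round toward zero for magnitudes)."""
    return mantissa >> drop


def round_ties_even(mantissa: int, drop: int) -> int:
    """Drop the lowest `drop` bits using round to nearest, ties to even.

    Note: the result may carry into one extra bit (e.g. 0b1111 -> 0b100
    with drop=2); a real float routine would then renormalize by shifting
    right and bumping the exponent.
    """
    if drop == 0:
        return mantissa
    kept = mantissa >> drop
    remainder = mantissa & ((1 << drop) - 1)
    half = 1 << (drop - 1)
    # Round up when above the halfway point, or exactly halfway and odd.
    if remainder > half or (remainder == half and (kept & 1)):
        kept += 1
    return kept


for m in (0b1001, 0b1010, 0b1011, 0b1110):
    print(f"{m:04b} -> truncate {truncate(m, 2):b}, "
          f"nearest-even {round_ties_even(m, 2):b}")
```

Truncation always errs in the same direction, which is one plausible reason a plot computed with it can look slightly lopsided, whereas ties-to-even spreads the rounding error in both directions.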
01-01-2023, 05:45 PM
Post: #12
RE: FORTH for the PC-G850(V)(S)
Happy to share that Forth850 v1.0 is now available on GitHub.
The basic 8K version has 295 words. The more complete 11K version has 349 words, including IEEE 754 single precision and a more capable line editor (like Forth500). Both versions were extensively tested and verified *). The performance has not changed since the initial release. The n-queens benchmark is solved in 0.865 seconds.
I wrote three versions of a new IEEE 754 single precision math library with standard floating point operations in Z80 assembly solely using Z80 registers, i.e. "memoryless" **). All trig and other math functions are defined in Forth in MATH.FTH.
- math.asm: a truncating version that is small, 960 bytes of code;
- mathr.asm: a proper rounding version (IEEE 754 round to nearest, ties to even), 1085 bytes;
- mathri.asm: like mathr.asm extended with IEEE INF and NAN (Forth850 does not use INF/NAN) and signed zero, but no subnormals (which are evil), 1286 bytes.
It felt quite satisfying to put this all together, as everything fits in Z80 registers, including shadow, but not IX and IY which remain unused. Some effective use of "Z80 tricks" I learned in the 80s.
- Rob
*) while this version was extensively tested and verified, there is no guarantee that there are zero bugs. Please report issues so I can fix them.
**) "memoryless" but a PUSH+POP must be used to copy to Z80 shadow registers, so there is at most one PUSH+POP pair per flop (add, sub, mul, div).
01-03-2023, 03:32 AM
(This post was last modified: 01-03-2023 03:40 AM by F-73P.)
Post: #13
RE: FORTH for the PC-G850(V)(S)
Great work!
Did you write routines to convert between strings and floats? I've started writing string-to-double and double-to-string routines and it's quite challenging, requiring extended-precision arithmetic and tables.
The C language combines all the power of assembly language with all the ease-of-use of assembly language
01-03-2023, 06:57 PM
(This post was last modified: 01-04-2023 02:30 PM by robve.)
Post: #14
RE: FORTH for the PC-G850(V)(S)
(01-03-2023 03:32 AM)F-73P Wrote: Did you write routines to convert between strings and floats? I've started writing string-to-double and double-to-string routines and it's quite challenging, requiring extended-precision arithmetic and tables.
Yes. For single precision floating point, that is. I evaluated three strategies with tables. Because single precision floating point decimal exponents are limited to the +/-38 range, a full powers of 10 table is logical to use, but overkill. I found that a table with 12 entries saves memory and is sufficiently accurate, almost as accurate as a full table. I've documented this in the mathr.asm assembly source and also in mathri.asm (same, but with inf/nan and signed zero), see the fpow10 routine:
Code:
; method min bits mean bits
where "min bits" means the minimal number of bits guaranteed to be accurate -log(d)/log(2.0) with minimum d of relative errors e between x and y = fabs(x - y)/fabs(y), and "mean bits" means the number of bits accurate on average (relative error) in 10 million random samples: -log(a/SAMPLES)/log(2.0) with a the sum of relative errors. The mantissa has 24 bits. Note that summing relative errors can get over 25 "mean bits", which is when more cases are exact than have errors. I ran the tests in C code (DM me if you want this code).
The method selected (table of 12) is enabled with .if-.endif, while the others are disabled, but could be used. I would believe that with double precision floating point you could use a small table, perhaps as small as 12 but probably larger. Exponentiation by squaring is another option to keep the table small. See my assembly source code for the Z80 routines.
As you also point out, the powers of 10 tables are used by atof (string to float) and ftoa (float to string). The mathr.asm ftoa routine computes ((exponent - bias - 1) * 77 + 77 + 82) / 256 to estimate the decimal exponent given the biased binary exponent. I've seen this elsewhere, but I made an improvement to adjust with +82 to prevent underestimation. With double precision this will need work and more tweaking.
In the atof routine I used a 32-bit mantissa to multiply by 10 to maintain accuracy and produce a 24-bit mantissa from a string of digits without loss of precision up to 31 digits, with rounding to nearest, ties to even, when over 10 digits or so. For double precision, this needs a 64-bit mantissa.
Beware of pitfalls when parsing floats. Floats can be represented in many different ways, such as 1234567890E-45 and 0.0000000000123456789E45 for example. Rather obvious to look at, but the code should handle it.
I wrote the math code in less than a week in the evenings, but testing took another week or two to make sure there are no bugs and to make some tweaks. For example, I found that the round to nearest, ties to even, with "sticky bits" didn't work correctly, so I came up with a better way to implement this mechanism without requiring much code or using more Z80 registers, which I can't, since they are already all in use.
There is a lot more I could comment on. If you have questions, let me know in your reply or DM me.
- Rob
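The decimal exponent estimate quoted above, ((exponent - bias - 1) * 77 + 77 + 82) / 256, relies on 77/256 being close to log10(2). The Python sketch below evaluates the formula literally and compares it with the exact decimal exponent of the power of two at the bottom of each binade. Two things are assumptions here rather than facts from the source: that bias is the usual IEEE 754 single precision value 127, and that the division rounds toward minus infinity (as an arithmetic shift would). The function name is made up for illustration.

```python
import math

BIAS = 127  # assumed: IEEE 754 single precision exponent bias


def estimate_decimal_exponent(biased_exponent: int) -> int:
    """Literal evaluation of the formula from the post, with floor division."""
    return ((biased_exponent - BIAS - 1) * 77 + 77 + 82) // 256


# Compare against floor(log10(2**b)) for every normal binary exponent b.
differences = set()
for biased in range(1, 255):
    b = biased - BIAS
    exact_low = math.floor(b * math.log10(2))
    differences.add(estimate_decimal_exponent(biased) - exact_low)

print(sorted(differences))
```

Under these assumptions the estimate is never below the decimal exponent of 2^b and is at most one above it, which seems consistent with the stated aim of avoiding underestimation. Whether this matches the exact conventions used in mathr.asm would need to be checked against the source.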
01-05-2023, 07:52 AM
(This post was last modified: 01-06-2023 06:42 AM by F-73P.)
Post: #15
RE: FORTH for the PC-G850(V)(S)
Brilliant!
I'm using a relatively simple but inefficient approach for converting strings to single-precision floating-point: use multiple-precision arithmetic and tables to find integers s, e such that s x 2^e = user input, where s is an element of [2^23, 2^24) and is rounded up. Then use s, e to obtain the IEEE 754 single-precision encoding.
It's working for the few values I've tested (0.5 -> 0x3F000000, 2 -> 0x40000000 and 2.5 -> 0x40200000). Once I've added higher powers of 10 to the table and parsing exponents I'll try smaller and bigger values.
For floating-point to string I'll use the first algorithm in this paper.
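The last step described above, going from the integer pair (s, e) to the IEEE 754 single precision bit pattern, can be sketched in Python as follows. This is an illustration under the stated convention that the value equals s x 2^e with s in [2^23, 2^24). The function name is hypothetical and only positive normal numbers are handled.

```python
import struct


def encode_single(s: int, e: int) -> int:
    """Bit pattern of the positive normal float s * 2**e, 2**23 <= s < 2**24."""
    if not (1 << 23) <= s < (1 << 24):
        raise ValueError("s must be a normalized 24-bit significand")
    # s = 1.fraction * 2**23, so the unbiased exponent is e + 23.
    biased = e + 23 + 127
    if not 0 < biased < 255:
        raise ValueError("exponent out of range for a normal single")
    # Drop the implicit leading 1 bit and pack exponent and fraction.
    return (biased << 23) | (s - (1 << 23))


def reference_bits(x: float) -> int:
    """Bit pattern produced by the platform's own float32 conversion."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


for s, e, value in ((1 << 23, -24, 0.5), (1 << 23, -22, 2.0), (0xA00000, -22, 2.5)):
    print(f"{value}: 0x{encode_single(s, e):08X} (reference 0x{reference_bits(value):08X})")
```

The three test values from the post come out as 0x3F000000, 0x40000000 and 0x40200000, matching the reference conversion.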
05-20-2023, 11:11 PM
(This post was last modified: 05-21-2023 01:48 AM by robve.)
Post: #16
RE: FORTH for the PC-G850(V)(S)
Forth850 v1.4 released
Forth850 GitHub repository
Here's a summary of updates since the first stable release v1.0 (January 9, 2023):
- v1.1 fix CASE OF not compiling
- v1.2 new words UD/MOD, D/MOD, DMOD and D/; fix TO for 2VALUE
- v1.3 permit floating point constants with no digits after E, such as 1E, as per Forth 2012 standard
- v1.4 new word F>S added to the full version with floating point; improve LOGS.FTH and MATH.FTH, particularly the F** word, see below
Exponentiation by squaring
The F** algorithm's range and accuracy are improved in v1.4, see also the HP forum thread by J-F Garnier, small challenge. The exponentiation-by-squaring conditions are tightened to ensure accuracy of the floating point result within 1 ULP, i.e. the least significant bit may be off in some cases. Exponentiation-by-squaring is fast and accurate for many integer F** calculations where the standard exp-log algorithm may produce a 1/2 ULP error, i.e. not necessarily return an integer when an integer result is expected.
Code:
: F** ( r1 r2 -- r3 )
Enjoy!
- Rob
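As a generic illustration of exponentiation by squaring (not a transcription of the F** word, whose Forth source is not shown in this thread), here is a minimal Python sketch for integer exponents. The function name is an assumption, and the conditions under which Forth850 chooses this path over exp-log are not modeled.

```python
def pow_by_squaring(x: float, n: int) -> float:
    """Compute x**n for integer n with O(log |n|) multiplications."""
    if n < 0:
        return 1.0 / pow_by_squaring(x, -n)
    result = 1.0
    while n:
        if n & 1:        # this bit of the exponent contributes a factor
            result *= x
        x *= x           # square for the next bit
        n >>= 1
    return result


print(pow_by_squaring(3.0, 5), pow_by_squaring(2.0, 10), pow_by_squaring(2.0, -2))
```

Since every intermediate product here is a small exactly representable integer, results such as 3^5 = 243 come out exact, which is the property the post contrasts with an exp-log computation.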
07-30-2023, 10:53 AM
Post: #17
|
Hi Rob. Please help me. I have a G850VS and I would love to study FORTH using your code. However, I keep getting ERR-13 when entering any Forth word. I noticed you had similar problems in the past. Any advice please?
PS no issues when loading other IHX apps.
Thanks
Iain
07-31-2023, 01:53 AM
(This post was last modified: 07-31-2023 04:37 PM by robve.)
Post: #18
|
(07-30-2023 10:53 AM)iaincockburn Wrote: Hi Rob. Please help me. I have a G850VS and I would love to study FORTH using your code. However I keep getting ERR-13 when entering any forth word. I noticed you had similar problems in the past. Any advice please?
Have you added any definitions, or is this a problem that occurs right after loading Forth850? I assume you're using the most recent version of Forth850. I've re-tested the wav and ihx files in the repo.
Before loading Forth850, please make sure to allocate sufficient free space in the Monitor first, see the README.
The old error -13 problem was not a big issue. It was caused by a simple typo that I corrected and uploaded a fix for within a day or so.
- Rob
08-06-2023, 10:11 AM
Post: #19
|
Hi Rob - thanks again for your speedy response and my apologies for my tardiness.
Yes, I had read your README, set USER to your recommendation and ran it with R100 as suggested (also tried G100). I also did a hard reset and cleared memory, all to no avail. I think I had loaded your latest version from GitHub (release 1.4). Please don't worry about it!