Threaded Mode | Linear Mode

robve · 10-10-2021, 01:38 AM

(10-08-2021 01:04 PM)robve Wrote: However, the dictionary search can be optimized, like most Forth implementations. For example, the HP-71b limits searching based on the word length, thus checks dictionary entries for words of the same length only. Other implementations use trees or hashing. There are also simple and practical ways to speed up dictionary search, which I will try. For starters, comparing the length and the first character simultaneously to check a dictionary entry will speed things up.

A quick update for those interested in this project, or in Forth, or in the E500's CPU.

The new FIND-WORD assembly code listed further below runs about twice as fast as the old FIND-WORD code (the version shown in the previous post). This means that case-insensitive dictionary searches in Forth500 should speed up quite a bit. Loading and compiling a Forth source file is largely determined by dictionary search speed.

The new CPU cycle stats compared to the old FIND-WORD, expressed in CPU cycles per word compared:

mismatching length: old = 54 cycles, new = 34 cycles
matching length but first characters differ: old = 108 cycles, new = 48 cycles
matching words, character-by-character comparison: old = 53 cycles, new = 43 cycles

The cost of a word length mismatch is 34 cycles. If the length matches, the cost of a first character mismatch is 48 cycles total (i.e. including the length match).

Assuming a directory size of 519 words (expected with Forth500), this means that a full dictionary search takes 23ms to 32ms or slightly longer, depending on the word being searched:
34x519/768KHz = 23ms
48x519/768KHz = 32ms

For example, an integer value 123 in the Forth source input matches the length of all 3-character words, but matches none of the words that start with a 1 thus taking 48x519 cycles to complete or 32ms. Explanation: all words, including integers, are first searched in the dictionary before pushed on the stack or compiled as an integer.

The new FIND-WORD assembly, annotated with CPU cycles (disclaimer: this may not be the final version):

Code:

find_word:      dw      to_body

                db      $09

                db      'FIND-WORD'             ; ( c-addr u -- 0 0 | xt 1 | xt -1 )

find_word_xt:   local

                mv      (!gl),a                 ; (gl) holds the string length (length < 64 checked next)

                mv      il,64                   ; Compare the string length

                sub     ba,i                    ; to the max of 63 characters

                popu    ba                      ; BA holds the string address

                pushu   x                       ; Save IP

                jrnc    lbl6                    ; String too long?

                mv      y,!base_address

                add     y,ba                    ; Y holds the string address

                mv      (!fl),[y++]             ; (fl) holds the first character of the string to search

                mv      (!yi),y                 ; (yi) holds the string address + 1

                mv      (!zi),y                 ; Set 2nd byte of (zi) to base address segment $b

                mvw     (!zi),[!last_xt+3]      ; (zi) holds the 20 bit LAST address

;               LOOP OVER DICTIONARY

lbl1:           mv      y,(!yi)         ; 5     ; Y holds the string address + 1

                mv      il,(!gl)        ; 4     ; IL holds the string length

                                        ; =9 cycles

;               NEXT WORD IN THE DICTIONARY

lbl2:           mv      x,(!zi)         ; 5     ; X holds the address of the dictionary entry

                or      (!zi),(!zi+1)   ; 6     ; Check if the address of the dictionary entry is zero

                jrz     lbl6            ; 2/3   ; Dictionary entry address is zero?

                mvw     (!zi),[x++]     ; 7     ; (zi) holds the previous dictionary link address

                mv      ba,[x++]        ; 5     ; A holds the word length and B holds the first character

;               COMPARE STRING LENGTHS

                sub     a,il            ; 3     ; Compare string lengths

                test    a,$7f           ; 3     ; Check string lengths, ignore immediate bit, keep smudge bit to force mismatch

                jrnz    lbl2            ; 2/3   ; String lengths are not the same?

                                        ; =33 cycles +1 for jump if the length does not match

;               COMPARE FIRST CHARACTERS

                ex      a,b             ; 3     ; B holds immediate bit to save for later, A holds first character

                xor     a,(!fl)         ; 4     ; Compare first characters

                jrz     lbl4            ; 2/3   ; First characters match?

                test    a,$df           ; 3     ; Check if case insensitive bits match

                jrnz    lbl2            ; 2/3   ; Case insensitive characters differ?

                                        ; =33+14=47 cycles +1 for jump if the length does not match and the first character did not match

                mv      a,(!fl)         ; 3     ; A holds the first character of the string to search

                or      a,$20           ; 3     ; Make it lower case (if A is a letter, checked next)

                cmp     a,'a'           ; 3

                jrc     lbl2            ; 2/3   ; A is not a letter?

                cmp     a,'{'           ; 3

                jrnc    lbl2            ; 2/3   ; A is not a letter?

                dec     il              ; 3     ; Decrement string length

                jrz     lbl5            ; 2/3   ; String length is zero?

                                        ; =47+22=69 cycles if the length matched and the first character matched

;               LOOP OVER STRINGS TO COMPARE

lbl3:           mv      a,[x++]         ; 4     ; A holds the next charater of the word

                mv      (!el),[y++]     ; 6     ; (el) holds the next character of the string to match

                xor     a,(!el)         ; 4     ; Compare characters

                jrz     lbl4            ; 2/3   ; Characters match?

                test    a,$df           ; 3     ; Check if case insensitive bits match

                jrnz    lbl1            ; 2/3   ; Case insensitive characters differ?

                mv      a,(!el)         ; 3     ; A holds the next character of the string to match

                or      a,$20           ; 3     ; Make it lower case (if A is a letter, checked next)

                cmp     a,'a'           ; 3     ; A is not a letter?

                jrc     lbl1            ; 2/3

                cmp     a,'{'           ; 3     ; A is not a letter?

                jrnc    lbl1            ; 2/3

lbl4:           dec     il              ; 3     ; Decrement string length

                jrnz    lbl3            ; 2/3   ; String length is not zero?

                                        ; =43 cycles for each subsequent character matched

;               FOUND A MATCHING WORD IN THE DICTIONARY

lbl5:           add     ba,ba                   ; Check immediate bit stored in B

                mv      ba,x                    ; BA holds the execution token

                popu    x                       ; Restore IP

                pushu   ba                      ; Save new 2OS execution token

                mv      ba,-1                   ; Set new TOS to -1, word is not immediate

                jrnc    lbl7                    ; Immediate bit is unset?

                mv      ba,1                    ; Set new TOS to 1, word is immediate

                jr      lbl7

;               NOT FOUND

lbl6:           popu    x                       ; Restore IP

                sub     ba,ba                   ; Set TOS to zero

                pushu   ba                      ; Set 2OS to zero

lbl7:           jp      !cont__

The new code is only one byte longer when assembled to binary than the old code!

- Rob