(10-08-2021 01:04 PM)robve Wrote: However, the dictionary search can be optimized, like most Forth implementations. For example, the HP-71b limits searching based on the word length, thus checks dictionary entries for words of the same length only. Other implementations use trees or hashing. There are also simple and practical ways to speed up dictionary search, which I will try. For starters, comparing the length and the first character simultaneously to check a dictionary entry will speed things up.
A quick update for those interested in this project, or in Forth, or in the E500's CPU.
The new FIND-WORD assembly code listed further below runs about twice as fast as the old FIND-WORD code (the version shown in the previous post). This means that case-insensitive dictionary searches in Forth500 should speed up quite a bit. Loading and compiling a Forth source file is largely determined by dictionary search speed.
The new CPU cycle stats compared to the old FIND-WORD, expressed in CPU cycles per word compared:
mismatching length: old = 54 cycles, new = 34 cycles
matching length but first characters differ: old = 108 cycles, new = 48 cycles
matching words, character-by-character comparison: old = 53 cycles, new = 43 cycles
The cost of a word length mismatch is 34 cycles. If the length matches, the cost of a first character mismatch is 48 cycles total (i.e. including the length match).
Assuming a directory size of 519 words (expected with Forth500), this means that a full dictionary search takes 23ms to 32ms or slightly longer, depending on the word being searched:
34x519/768KHz = 23ms
48x519/768KHz = 32ms
For example, an integer value 123 in the Forth source input matches the length of all 3-character words, but matches none of the words that start with a 1 thus taking 48x519 cycles to complete or 32ms. Explanation: all words, including integers, are first searched in the dictionary before pushed on the stack or compiled as an integer.
The new FIND-WORD assembly, annotated with CPU cycles (disclaimer: this may not be the final version):
Code:
find_word: dw to_body
db $09
db 'FIND-WORD' ; ( c-addr u -- 0 0 | xt 1 | xt -1 )
find_word_xt: local
mv (!gl),a ; (gl) holds the string length (length < 64 checked next)
mv il,64 ; Compare the string length
sub ba,i ; to the max of 63 characters
popu ba ; BA holds the string address
pushu x ; Save IP
jrnc lbl6 ; String too long?
mv y,!base_address
add y,ba ; Y holds the string address
mv (!fl),[y++] ; (fl) holds the first character of the string to search
mv (!yi),y ; (yi) holds the string address + 1
mv (!zi),y ; Set 2nd byte of (zi) to base address segment $b
mvw (!zi),[!last_xt+3] ; (zi) holds the 20 bit LAST address
; LOOP OVER DICTIONARY
lbl1: mv y,(!yi) ; 5 ; Y holds the string address + 1
mv il,(!gl) ; 4 ; IL holds the string length
; =9 cycles
; NEXT WORD IN THE DICTIONARY
lbl2: mv x,(!zi) ; 5 ; X holds the address of the dictionary entry
or (!zi),(!zi+1) ; 6 ; Check if the address of the dictionary entry is zero
jrz lbl6 ; 2/3 ; Dictionary entry address is zero?
mvw (!zi),[x++] ; 7 ; (zi) holds the previous dictionary link address
mv ba,[x++] ; 5 ; A holds the word length and B holds the first character
; COMPARE STRING LENGTHS
sub a,il ; 3 ; Compare string lengths
test a,$7f ; 3 ; Check string lengths, ignore immediate bit, keep smudge bit to force mismatch
jrnz lbl2 ; 2/3 ; String lengths are not the same?
; =33 cycles +1 for jump if the length does not match
; COMPARE FIRST CHARACTERS
ex a,b ; 3 ; B holds immediate bit to save for later, A holds first character
xor a,(!fl) ; 4 ; Compare first characters
jrz lbl4 ; 2/3 ; First characters match?
test a,$df ; 3 ; Check if case insensitive bits match
jrnz lbl2 ; 2/3 ; Case insensitive characters differ?
; =33+14=47 cycles +1 for jump if the length does not match and the first character did not match
mv a,(!fl) ; 3 ; A holds the first character of the string to search
or a,$20 ; 3 ; Make it lower case (if A is a letter, checked next)
cmp a,'a' ; 3
jrc lbl2 ; 2/3 ; A is not a letter?
cmp a,'{' ; 3
jrnc lbl2 ; 2/3 ; A is not a letter?
dec il ; 3 ; Decrement string length
jrz lbl5 ; 2/3 ; String length is zero?
; =47+22=69 cycles if the length matched and the first character matched
; LOOP OVER STRINGS TO COMPARE
lbl3: mv a,[x++] ; 4 ; A holds the next charater of the word
mv (!el),[y++] ; 6 ; (el) holds the next character of the string to match
xor a,(!el) ; 4 ; Compare characters
jrz lbl4 ; 2/3 ; Characters match?
test a,$df ; 3 ; Check if case insensitive bits match
jrnz lbl1 ; 2/3 ; Case insensitive characters differ?
mv a,(!el) ; 3 ; A holds the next character of the string to match
or a,$20 ; 3 ; Make it lower case (if A is a letter, checked next)
cmp a,'a' ; 3 ; A is not a letter?
jrc lbl1 ; 2/3
cmp a,'{' ; 3 ; A is not a letter?
jrnc lbl1 ; 2/3
lbl4: dec il ; 3 ; Decrement string length
jrnz lbl3 ; 2/3 ; String length is not zero?
; =43 cycles for each subsequent character matched
; FOUND A MATCHING WORD IN THE DICTIONARY
lbl5: add ba,ba ; Check immediate bit stored in B
mv ba,x ; BA holds the execution token
popu x ; Restore IP
pushu ba ; Save new 2OS execution token
mv ba,-1 ; Set new TOS to -1, word is not immediate
jrnc lbl7 ; Immediate bit is unset?
mv ba,1 ; Set new TOS to 1, word is immediate
jr lbl7
; NOT FOUND
lbl6: popu x ; Restore IP
sub ba,ba ; Set TOS to zero
pushu ba ; Set 2OS to zero
lbl7: jp !cont__
The new code is only one byte longer when assembled to binary than the old code!
- Rob