(03-12-2014 10:14 PM) Marcus von Cube Wrote: Let's talk about facts again.
The Atmel processor is a very limited device. It's fast (at high clock speeds) but it's lacking memory to an extent which made Pauli and me geniuses in saving flash and RAM space. This has several impacts:
We cannot use the processor to its full extent because flash is so limited that we had to use the so-called Thumb mode, which features a reduced instruction set with a smaller memory footprint at the cost of slower execution (limited instruction and processor register set). The code is optimized for size, not for speed. There are some crude manual optimizations just to squeeze out as much user flash for library functions as possible.
The limited RAM forces us to pack everything as tightly as possible, which slows down access (e.g. bit fields for flags).
OTOH, limited means simple: no dynamic memory (and thus no memory leaks), all sizes are known, memory is moved and not linked via pointers, ...
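To illustrate the packing trade-off mentioned above, here is a minimal Python sketch of flags stored one bit each in a fixed-size byte array. It is not the firmware's code: the flag count and the names NUM_FLAGS, set_flag and get_flag are made up for the example. It only shows why a packed flag costs a few extra operations (divide, shift, mask) per access compared with one byte per flag.

```python
# Hypothetical sketch, not the WP 34S source.
NUM_FLAGS = 112                           # assumed count, for illustration only
flags = bytearray((NUM_FLAGS + 7) // 8)   # fixed size, no dynamic allocation

def set_flag(n, value=True):
    """Set or clear flag n inside its packed byte."""
    byte, bit = divmod(n, 8)              # which byte, which bit within it
    if value:
        flags[byte] |= 1 << bit
    else:
        flags[byte] &= ~(1 << bit) & 0xFF

def get_flag(n):
    """Read flag n: one shift and one mask on top of the byte access."""
    byte, bit = divmod(n, 8)
    return bool((flags[byte] >> bit) & 1)
```

With these assumed numbers the flags occupy 14 bytes instead of 112, at the price of the extra shift and mask on every access.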
To answer your questions about the different memory types in the device: there are three.
1. Non-volatile RAM, 2 KB. This is user memory, holding all the registers, flags, and the program steps which can be edited directly on the device.
2. Volatile RAM, 4 KB. This can only be used temporarily for internal operations. The contents are lost whenever the processor goes idle, that is, whenever user input is being awaited. We use it as the internal processor stack and for most internal calculations.
3. Flash memory, 128 KB. Most of it contains the firmware. Some KB are available to the user for a backup of the non-volatile RAM (2 KB) and for keystroke programs known as the library. These programs cannot be edited directly, but they can be copied to and from non-volatile RAM, or compiled externally and sent to the calculator as part of the firmware (calc_full.bin). This memory area is slightly slower than the other two, but that doesn't really matter for the keystroke programs stored in flash. The impact on firmware execution speed is of more importance.
To add some info about what Pauli wrote about access to local and global registers:
Global registers move around in memory, depending on the number of registers allocated (REGS command) and on whether double precision mode is in effect. Thus, finding a specific register needs a few instructions (shifts and adds). Local registers are similar in nature: their size is variable (double precision or not) and their location is dynamic: they live on the subroutine return stack, the same one that manages XEQ and RTN. The effort to find such a register in memory is comparable to that of finding a global register, just a few adds and shifts. The distinction between local and global is made through the internal index: 0 to 111 is global, 112 and higher is local and requires a different access path. To be precise, the range 100 to 111 for the lettered registers requires its own scheme, but it deviates only minimally from that for 0 to 99. The effects on execution time should be negligible.
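The "few shifts and adds" can be pictured with a small Python sketch. Everything beyond the index boundary of 112 is an assumption made for the example: the register sizes of 8 and 16 bytes, the idea that the global block ends at a fixed address `top` and grows downwards, and the names register_address, top and local_base. The lettered registers 100 to 111 are deliberately not modeled.

```python
# Hypothetical sketch, not the WP 34S source.
LOCAL_BASE_INDEX = 112   # from the post: indices 112 and higher are local

def register_address(index, num_regs, double_precision, top, local_base):
    """Return the assumed byte address of a register.

    num_regs          numbered global registers currently allocated (REGS)
    double_precision  True if registers take 16 bytes instead of 8 (assumed sizes)
    top               assumed end address of the global register block
    local_base        assumed start of the active local frame on the return stack
    """
    shift = 4 if double_precision else 3          # multiply by 16 or by 8
    if index >= LOCAL_BASE_INDEX:                 # local: separate access path
        return local_base + ((index - LOCAL_BASE_INDEX) << shift)
    if index >= num_regs:                         # lettered registers not modeled
        raise IndexError("register not allocated in this sketch")
    base = top - (num_regs << shift)              # block start moves with REGS
    return base + (index << shift)
```

In this model, changing REGS or switching precision moves `base`, which is why the address has to be recomputed on every access instead of being a constant.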
When local registers are allocated, RTN is a bit more involved because it not only needs to pull the return address off the subroutine return stack but also has to skip the local register frame and set up some internal pointers to reactivate the local registers of the caller. Repeating LocR commands on the same stack frame (adjusting the available number of registers) has to move some memory around, which may affect performance. The latter should be a rare case anyway.
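The extra work in RTN can be sketched with a toy model of the return stack in Python. This is an illustration under stated assumptions, not the firmware's data layout: the class ReturnStack, its tagged entries and the method names are invented here, and a Python list stands in for the packed memory area.

```python
# Hypothetical sketch, not the WP 34S source.
class ReturnStack:
    def __init__(self):
        self.items = []                      # ("ret", address) or ("frame", registers)

    def xeq(self, return_address):
        """Call a subroutine: push the return address."""
        self.items.append(("ret", return_address))

    def locr(self, count):
        """Allocate local registers, or resize the frame of the current level."""
        if self.items and self.items[-1][0] == "frame":
            old = self.items.pop()[1]        # repeated LocR: contents are moved
            regs = old[:count] + [0.0] * (count - len(old))
        else:
            regs = [0.0] * count
        self.items.append(("frame", regs))

    def active_locals(self):
        """Locals are active only if a frame sits on top of the stack."""
        if self.items and self.items[-1][0] == "frame":
            return self.items[-1][1]
        return None

    def rtn(self):
        """Skip the local frame (if any), then pop the return address."""
        if self.items and self.items[-1][0] == "frame":
            self.items.pop()
        if not self.items:
            return None                      # empty stack: nothing to return to
        return self.items.pop()[1]
```

After rtn(), a frame left on top of the stack belongs to the caller, so in this model the caller's locals become active again without any further bookkeeping. The real firmware has to adjust pointers to achieve the same effect.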
Just enjoy the features. That's why we have implemented them.
Marcus,
I hugely appreciate this outstanding explanation. It answers most of my questions. I am impressed that you took the time to write this lengthy insight on the forum. I hope many other 34S users can read it as well.
I am a daily full-time end-user of "your" WP34S and I could not easily perform my daily chores without it. My interest in the technical details is simply in order to make the fastest and most efficient program that I can. I do indeed enjoy all of the features. Obviously, Walter's manual makes it all possible for me as well.
Thanks again to all of you involved in the 34S,
- Sikuq
PS: All those mathematical functions in the 34S libraries, just how do they work?