Threaded Mode | Linear Mode

Claudio L. · 10-23-2014, 10:11 PM

(10-22-2014 01:10 PM)Claudio L. Wrote: 30% seems a lot for anybody. When I watched your presentation, I understood that 30% was also including all your other optimizations. But each model will have a different impact. I'm with you, if the difference is 30%, it's not minor by any means. But on other models I don't think that it will be a big impact. If I remember correctly for the C/Regvars, the change in speed was less than 10%, and that included disabling all compiler optimizations, not just that indirection.

Quoting myself for reference here.
I did a few tweaks and created 50g RPL versions of the problem with the following conditions:

1) Original code
2) Stack code, using the values in the stack rather than a list
3) Original code with double indirection removed
4) Stack code, with double indirection removed
5) Bitmap code

I'll download and post the code for all cases over the weekend.
Right now let's go straight to the point. These results are on a 50g at normal speed:

1) Original code = 90 seconds
2) Stack code = 60.6 seconds
3) Original w/no indirection = 67.6 seconds
4) Stack code w/no indirection = 57.7 seconds
5) Bitmap code = 54.3 seconds

A few observations:
3 vs. 1: When using (slow) lists, removing that indirection has a big impact. The advantage comes not much from the read indirection, but mainly from not writing the value back to a list (which forces creation of an entire list every time) until the end of the inner loop.
2 vs 1: As I mentioned before: lists are slow. Using the stack provides a boost.
2 vs 3: When you remove the list creation from the inner loop, the effect of the lists is not as bad, so the difference is smaller.
4 vs 2: With lists out of the picture, the penalty for the indirection within the stack is minor (as it should be).
5 vs 2, or 5 vs 4: The Saturn CPU isn't particularly good at bit manipulations, so this technique is not such a massive improvement over the normal solution. Still edges faster.

Conclusions:
a) The effect of the indirection is large when using lists (PUT much more than GET).
b) The effect of the indirection removal is small when using the stack
c) The bitmap code is still faster than the normal approach, but not by much in this model.

Claudio