HP Forums
Can HP Prime be faster? - firmware performance comparison - Printable Version




Can HP Prime be faster? - firmware performance comparison - komame - 08-27-2023 09:52 AM

A few years ago, while writing various programs and experimenting with the HP Prime, I noticed that after one of the firmware updates, something that used to work smoothly suddenly started lagging.
I know that the HP Prime can execute programs at different speeds, even when running the same program multiple times. However, this has already been discussed on this forum, and that's not what I want to talk about now. I observed that different firmware releases perform certain operations at different speeds.

Several years have passed, and I finally found the time to dedicate to this issue, and believe me, it took me many hours.

All the actions described below were performed on a real HP Prime G1 rev. A.

First, I wrote a test program in HPPPL and prepared an expression (a sum) to be evaluated:
Code:
EXPORT PERF_TEST()

BEGIN
  local s_tck,e_tck,x,y,c,lst,st;
  PRINT();

// ===> test 1 <===
// list manipulation
  PRINT("=== TEST 1: list manipulation ===");

  TICKS▶s_tck;
  FOR c FROM 1 TO 10 DO
    lst:=MAKELIST(X,X,1,10000);
    FOR x FROM 1 TO 10000 DO
      lst(x):=10000-x;
    END;
  END;
  e_tck:=TICKS;
  PRINT("TIME: " + (e_tck-s_tck) + " ms");

// ===> test 2 <===
// string manipulation
  PRINT("=== TEST 2: string manipulation ===");
  TICKS▶s_tck;
  st:="";
  FOR x FROM 1 TO 800 DO
    st:=st+"ABCDEFGHIJKLMNOPQRSTUVWXYZ";
  END;
  REPEAT
    st:=RIGHT(st,DIM(st)-1);
  //PRINT(st);
  UNTIL DIM(st)<2;
  e_tck:=TICKS;
  PRINT("TIME: " + (e_tck-s_tck) + " ms");

// ===> test 3 <===
// fill screen
  PRINT("=== TEST 3: fill screen ===");
  RECT_P(0); //clear screen
  TICKS▶s_tck;
  FOR y FROM 1 TO 239 DO
    FOR x FROM 1 TO 319 DO
      PIXON_P(x,y,255+2*x+y);
    END;
  END;
  e_tck:=TICKS;
  PRINT("TIME: " + (e_tck-s_tck) + " ms");

  PRINT("=> done");  
END;

[attachment=12455]

This program performs three tests:
- operations on lists
- operations on text strings
- filling the screen with pixels in shades of blue (pixel by pixel)

Each time is the difference between two TICKS readings, taken immediately before and immediately after the test.
For the summation, the TEVAL command was used.

To ensure the most reliable results, it was necessary to eliminate any additional factors that could affect the performance of the HP Prime. We know that having more programs in the program catalog can impact performance, so before each test run on a specific firmware (also after each flashing), I always restored the factory settings (F+C+O reset), clearing everything from memory.
Then, I connected the HP Prime to the HP Connectivity Kit momentarily and uploaded only the above PERF_TEST program, making it the only program in the catalog. After that, I disconnected the HP Prime from USB, as the connection negatively affects performance due to continuous communication with the PC.
With a clean stack, I ran this program three times directly from the program catalog for each firmware. After completing the first run, I pressed ON key and immediately started the second run, then the third run in the same way - without leaving the program catalog, while simultaneously recording the results in a spreadsheet on my PC. Next, I went to the HOME screen and ran the summation according to the above example - also three times, but this time there were three consecutive stack entries:
[attachment=12456]

This concludes the test for a single firmware.

I noticed that even if I restore the factory settings twice (F+C+O) and load this program into memory after each restoration, the execution time can differ slightly. These differences were on the order of 1-2% (which can still amount to as much as 150 ms in absolute terms) and stayed within the tolerance range for each firmware. If someone attempts to replicate my test, they might get, for example, 8580 ms or 8790 ms instead of 8700 ms. However, running the test several times in a row always yielded similar results.
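To put those example figures in perspective, the deviation from the 8700 ms reference can be checked with a few lines of Python. This is only an illustrative sketch: the timings are the example values quoted in the paragraph above, not measured data, and the function name is made up for the illustration.

```python
def relative_deviation_percent(measured_ms, reference_ms):
    """Signed deviation of a timing from a reference, in percent."""
    return (measured_ms - reference_ms) / reference_ms * 100.0

reference = 8700  # example value from the post, in ms
deviations = {ms: relative_deviation_percent(ms, reference) for ms in (8580, 8790)}

for ms, dev in deviations.items():
    # Show both the relative and the absolute difference.
    print(f"{ms} ms: {dev:+.2f}% ({ms - reference:+d} ms)")
```

Both example values land within the 1-2% band mentioned above, even though the absolute differences are around 100 ms.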

This program and the summation calculation were run on many firmwares. I had to reflash my G1 over 20 times in total. Sometimes, the results were so surprising that I went back to previously tested versions to run the test again and make sure the initial measurement wasn't done incorrectly.

I performed some tests on beta versions as well. If anyone considers these cases unreliable, they can simply ignore them.


RESULTS.
Here are the performance differences of the firmware versions presented as percentages relative to the current version 14730:
[attachment=12457]
The current firmware, 14730, is on the left; the further to the right, the older the release. As you can see, only in a few cases were certain operations slower than on the current firmware; in the vast majority of cases they were faster.

In an extreme case, filling the screen with pixels (yellow series) in firmware 7820 is almost 48% faster than in the current version 14730. This is a colossal difference! It's visible to the naked eye.
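A percentage such as "48% faster" can be read in two ways, and the post does not state which formula the chart uses (the attached raw data in milliseconds would settle it). The sketch below shows, as an assumption-laden illustration only, what each reading would imply for the ratio of the old firmware's run time to the current one. The function names are hypothetical.

```python
def time_ratio_if_time_reduced(percent):
    """Old/current time ratio if 'X% faster' means the run takes X% less time."""
    return 1.0 - percent / 100.0

def time_ratio_if_speed_increased(percent):
    """Old/current time ratio if 'X% faster' means X% more work per unit of time."""
    return 1.0 / (1.0 + percent / 100.0)

for label, fn in (("less time", time_ratio_if_time_reduced),
                  ("higher speed", time_ratio_if_speed_increased)):
    print(f"48% faster read as {label}: old time = {fn(48):.3f} x current time")
```

Under the first reading the old firmware would need about half the time; under the second, about two thirds. Either way the gap is large enough to see with the naked eye.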

Back in 2015, I was developing a game for the HP Prime, and between firmware versions 8151 and 10637 there was a sudden drop in graphics performance. I mentioned this on the forum at the time, but I didn't have enough time to prove it with tests like these. There were two reasons for the slowdown I saw. First, drawing itself became significantly slower (as seen for firmware 10637). Second, while a key was held down, performance could drop by a further 30-35%; combined with the slower graphics operations, this made the drop even more pronounced. However, Tim explained to me that there are areas within the firmware that HP programmers don't have control over. If that's indeed the case, there might not be much that can be done about this specific keyboard handling issue, but there could still be potential to address other cases where performance used to be better and it's unclear why it isn't anymore :)

Of course, my tests don't cover everything, and they could be more detailed, including operations on matrices, trigonometric operations, or more advanced graphics operations. Maybe then there would be more conclusions to discuss. What I did was a very time-consuming process for me, so I limited it to these four factors.

I'd like to draw attention to how much these results fluctuate over time. This could suggest that there's still a lot that can be done with the performance of the HP Prime. The summation calculation (green series) even had an increasing trend, only experiencing a minor dip in the latest versions. I'm not referring to the sharp drop in 14588, as that's clearly a bug; something was off with the calculations in that firmware. Even calling the summation calculation twice on the stack caused a crash, and in this case, measurements were done after a soft-reset for each run, as there was no other way (I think this isolated result can be ignored). Irrespective of everything, this part of the test exhibited the most unstable results. It's likely that these areas undergo the most changes.

During the tests, I also noticed a change in pixel filling between firmware versions 12066 and 13333:
[attachment=12458]
You should know that this is not visible in the emulator; the effect can only be observed on a real HP Prime. I had to take photos of the screens to show you this comparison.

The difference in rendering color shade thresholds for individual values is clearly visible. In version 13333, they are very smooth. If this had an impact on performance, I think it would be beneficial if the programmer had the option to choose the rendering mode. However, looking at the graph (yellow series), we can see that version 13333 experiences a significant drop in the screen-filling test, yet version 13865 shows a substantial increase despite using this smoothing technique. This serves as further evidence that there is still much that can be done in this area, as the current firmware performs this nearly 14% slower.

Notice that this doesn't just apply to running an HPPPL program; since the equation also significantly changes its execution time, I suspect that tasks like plotting graphs or symbolic calculations (CAS) might also vary in execution time depending on the firmware version.

WHAT ABOUT THE G2?
I don't own an HP Prime G2, so I don't know how it works "under the hood" or how different from G1 the core system is. However, looking at firmware version 13865, which probably was the first for G2 and as we can see, one of the most efficient, especially from a G1 perspective (considering all measured factors), it's quite possible that the G2 could work faster too. Maybe someone could conduct a similar test for the G2.
I would also appreciate tests on other G1 models; a comparison between the current firmware and 7820 or 8151 would suffice.

I hope these pieces of information will be useful to someone and will positively impact the ongoing development of HP Prime calculators.

I'm also attaching a spreadsheet with raw data for individual tests (results in milliseconds).

That's all from my side.


RE: Can HP Prime be faster? - firmware performance comparison - matalog - 08-27-2023 08:23 PM

That's very interesting, thanks for taking the time to conduct the tests and share your insights with us.


RE: Can HP Prime be faster? - firmware performance comparison - komame - 08-29-2023 08:03 AM

I had another thought.

Let's take a closer look at the third test (chosen based on the largest amplitudes on the graph).

Code:
// ===> test 3 <===
// fill screen
  PRINT("=== TEST 3: fill screen ===");
  RECT_P(0); //clear screen
  TICKS▶s_tck;
  
  FOR y FROM 1 TO 239 DO
    FOR x FROM 1 TO 319 DO
      PIXON_P(x,y,255+2*x+y);
    END;
  END;
  e_tck:=TICKS;
  PRINT("TIME: " + (e_tck-s_tck) + " ms");

What does it exactly do?

Let's break it down into its fundamental components and analyze what's happening.

When we start measuring time, three activities are performed alternately:
- Loop iteration (outer loop 239 times, inner loop 319 times, resulting in a total of 76,241 iterations)
- Color calculation, involving mathematical operations like multiplication and addition (255+2*x+y)
- Turning on a pixel in the calculated color

One iteration turns on one pixel, and there are quite a few other tasks involved in achieving this. The act of turning on a pixel itself might not be the main performance-determining factor.

This means that this test gave us an averaged result for these three operations. Between firmware versions where the performance dropped, the actual pixel drawing operation might not have been slower at all. Assuming hypothetically that the issue was with the mathematical calculations, which became significantly slower in a particular firmware version, it could be possible that turning on the pixel itself was even faster. However, in combination with the performance drop during calculations, the overall result turned out worse.
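The point can be made concrete with a toy model in which the measured time is the iteration count multiplied by the sum of three per-iteration costs. The per-iteration costs below are hypothetical values invented purely for illustration; only the iteration count comes from the test itself.

```python
ITERATIONS = 239 * 319  # 76,241 inner-loop passes, as in test 3

def total_ms(loop_us, calc_us, pixel_us, iterations=ITERATIONS):
    """Total run time in ms from per-iteration costs given in microseconds."""
    return iterations * (loop_us + calc_us + pixel_us) / 1000.0

# Hypothetical per-iteration costs in microseconds (not measured values).
old = total_ms(loop_us=20.0, calc_us=30.0, pixel_us=40.0)
new = total_ms(loop_us=20.0, calc_us=60.0, pixel_us=30.0)

print(f"old firmware: {old:.0f} ms, new firmware: {new:.0f} ms")
print(f"pixel cost changed by {(30.0 - 40.0) / 40.0:+.0%}, "
      f"total changed by {(new - old) / old:+.0%}")
```

In this made-up scenario the pixel operation gets 25% cheaper, yet the total gets about 22% slower because the calculation cost doubled. A combined measurement cannot distinguish that from a slower pixel routine.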

A much better test would be to verify each of these individual components independently, including:
- Pure loop iteration, performed a larger number of times, without any content;
- Color calculation performed 76,241 times in a row without the loop;
- Turning on a pixel executed 76,241 times in a row without the loop.

While it's relatively easy to conduct such a test for pure loop iteration, the other two cases would ideally require a program with 76,241 lines of code, each performing the same operation. That would be a gigantic program, and other factors, such as program size, could then influence performance. The best compromise is therefore to run a loop with fewer iterations and, in each iteration, perform the calculation 10 times or turn on 10 pixels (it can be the same pixel in one constant color). This would greatly reduce the influence of the loop structure on the measurement of the individual operation.
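How much the repetition helps can be estimated with a short calculation. The sketch below assumes, purely for illustration, that one loop iteration costs as much as one operation; the real ratio on the HP Prime is unknown and is exactly what the split test is meant to find out.

```python
def loop_share(loop_cost, op_cost, ops_per_iteration):
    """Fraction of the measured time spent on loop bookkeeping."""
    per_iteration = loop_cost + ops_per_iteration * op_cost
    return loop_cost / per_iteration

# Hypothetical equal costs for the loop step and the operation.
for k in (1, 10):
    print(f"{k:2d} operation(s) per iteration: "
          f"loop overhead = {loop_share(1.0, 1.0, k):.1%} of the measured time")
```

With these assumed costs, the loop accounts for half of the measured time at one operation per iteration, but only about 9% at ten operations per iteration.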

Such a test would truly reveal the details of what caused the earlier versions of HP Prime to be faster.
The only issue is that I have to repeat these tests, which scares me. Flashing the unit multiple times again. I hope it will be worth it, and someone from the HP developers will make good use of it.


P.S. I'm eager for someone to confirm my results on a different unit (unfortunately, I only have one, and the emulator isn't suitable). Simply comparing the latest firmware with another version where the performance is noticeably better would be sufficient. I would be very grateful for any assistance.

Best regards


RE: Can HP Prime be faster? - firmware performance comparison - Tim Wessman - 08-29-2023 09:02 AM

Can't check anything from back then, but sure seems like you hit the change to a 32bit color instead of the prior 16bit between those revisions...

As features were added in the programming language, and as the number of keywords went up that may have impacted things slightly. The bigger problem always was the shitty memory management we had no control over as it came from the OS and which despite putting lots of tricks to try and avoid always ended up with badly fragmented memory...


RE: Can HP Prime be faster? - firmware performance comparison - komame - 08-29-2023 09:59 AM

Tim,

I understand that you had limited influence over memory management. What concerns me is the fluctuating performance with each release. It's not always slower, and although the latest release is indeed slower, there were instances in the past when it was worse (e.g. 14181). However, later releases, despite the addition of extra keywords and other firmware enhancements, managed to improve performance again in 14425 (which isn't an old firmware either), even surpassing the speed of 13865 for some benchmarks. I've tested this across various programs (not just the test one), and it's a consistently repeating issue for specific firmware.
If a particular firmware runs slower, it operates at a reduced speed (to varying extents) across the board. I suspect that this applies to every aspect, including built-in applications (which maybe I'll confirm through further research at some point).

The memory management issue does seem like a plausible explanation, but on the other hand, we can investigate which recent changes might have led to the decrease in performance. By systematically excluding certain changes, we can aim to work with those modifications in subsequent releases, ensuring that the once achieved high level of performance is maintained. While this might not always be straightforward, and compromises might be necessary, it's possible to measure and attempt to manipulate this situation.

But yes, this could be a daunting task if there's no control over the OS.


RE: Can HP Prime be faster? - firmware performance comparison - komame - 08-29-2023 07:27 PM

I have performed the additional measurements as announced, which involved dividing the third test from the previous study into three parts.

Each part focuses on a specific action:
- The first part solely iterates in an empty loop;
- The second part performs color calculation (a simple operation involving multiplication and addition) and saves it to a variable 10 times in each loop iteration, aiming to minimize the impact of iteration on measurements. The total number of color calculations is the same as in the initial study;
- The third part turns on a pixel in a constant color, also 10 times in each loop iteration. However, here, the screen is not filled due to the lack of iteration along the Y-axis. As a result, the same pixels in a single line are turned on multiple times.

This provided interesting results that confirmed what I previously mentioned. Breaking down the actions further highlighted the areas where the greatest differences in execution speed occur. Here are the results (table and chart):
[attachment=12473]

The performance favorite, 7820, demonstrates pixel activation speed that's 83% faster than 14730 (incredible, isn't it?).
A seemingly simple task like pixel activation doesn't appear to involve extensive memory management. Could there be something more going on here?
All of this seems very peculiar. These differences are too drastic, and the programs hardly do anything substantial. Even if we could manage to return to the speed of the previous firmware 14603, it would already be a promising start.
Looking at the chart, it's evident that the empty loop had an increasing trend in the recent versions, starting from 13333 up to 14603, which was only disrupted by the latest version, 14730, dropping by over 19%.
On the other hand, the shift to 32-bit mode mentioned by Tim, between versions 12066 and 13333, had a minimal impact or even no impact on performance, as the recorded decline is slight and could stem from something entirely different. Significantly larger drops and rises are visible both before and after the introduction of the 32-bit mode, so the activation of this mode probably had limited significance (which is good news).

If nothing can be done about this, all hope rests on Python, which is significantly faster compared to HPPPL (8-10 times faster). Hopefully, it can be further developed and brought to a stable version. However, if Python on the HP Prime relies on the same mechanisms underneath (given that the OS is the same), it might turn out that with each subsequent firmware release, performance will vary—sometimes faster, sometimes slower...

So, I conclude with a certain sense of incompleteness, yet simultaneously with hope that the HP specialists will come up with something that allows for the mitigation of these performance declines.

Best regards.


RE: Can HP Prime be faster? - firmware performance comparison - jte - 08-30-2023 12:36 AM

Piotr,

As someone who’s flashed HP Primes more than a few times, I can certainly appreciate the hours needed to be spent doing all this testing. Thank you for your efforts, and for sharing them with us.

I read over your post a few times and started writing a reply last night, but wanted to run one test because of the following:

(08-27-2023 09:52 AM)komame Wrote:  
During the tests, I also noticed a change in pixel filling between firmware versions 12066 and 13333:

You should know that this is not visible in the emulator; the effect can only be observed on a real HP Prime. I had to take photos of the screens to show you this comparison.

The change in the framebuffer format should be visible in the emulator. Today I tried (a portion of) your PPL program on revisions 14730 and 11226 of the emulator (on Windows, in both cases). I’m attaching two screenshots; upon close inspection, banding should be seen on the screen capture from the revision 11226 emulator.

Quote:The difference in rendering color shade thresholds for individual values is clearly visible. In version 13333, they are very smooth. If this had an impact on performance, I think it would be beneficial if the programmer had the option to choose the rendering mode.

While I was sleeping, Tim was posting :) This shows the power of Tim! (And of time zones?) Just like for Tim, the framebuffer format change also immediately popped up in my mind. The change in the framebuffer format is a low-level change. (The low-level graphics routines were adjusted / rewritten.) Revision 13333 was the first public release with the new framebuffer format. The framebuffer format change doesn’t substantially increase rendering code complexity (it may even, arguably, simplify some parts), but it does result in more memory being moved around by the parts of the project involved with graphics. (Including, for example, refreshing.)

All that said, your numbers show efficiency changing both before and after the change in framebuffer formats.

One task I was hoping to get done this summer was setting up an automated system for running tests on physical HP Primes. (Initially for integration into the bug tracker I set up, but also to verify / produce reports on how code changes affect efficiency. For this latter use case, I was thinking more along the lines of: this part of the code is going to get changed, so (A) log some performance data, (B) make changes, and (C) log some performance data [and, perhaps, GOTO (B)]. But, clearly, monitoring for performance changes in a more general manner would be wise. I should note that completing automated testing is part of getting a release out. [My intent is to increase the amount of automation.])
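The log / change / log cycle described above could feed a comparison step along the following lines. This is a minimal sketch and not a description of any actual HP tooling: the function name, the log format (test name mapped to milliseconds), the timings, and the 2% threshold are all assumptions. The threshold is chosen only because run-to-run differences of 1-2% were reported earlier in this thread.

```python
def find_regressions(before_ms, after_ms, threshold_percent=2.0):
    """Return {test: percent slowdown} for tests slower than the threshold."""
    regressions = {}
    for name, before in before_ms.items():
        after = after_ms.get(name)
        if after is None:
            continue  # test missing from the second log; nothing to compare
        change = (after - before) / before * 100.0
        if change > threshold_percent:
            regressions[name] = change
    return regressions

# Hypothetical timings in ms, invented for illustration.
before = {"list": 8700, "string": 5200, "pixels": 6900}
after = {"list": 8790, "string": 5150, "pixels": 7800}

result = find_regressions(before, after)
for name, change in result.items():
    print(f"{name}: {change:.1f}% slower")
```

Here only the hypothetical "pixels" test is flagged; the roughly 1% change in "list" stays below the threshold and is treated as noise.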

I’ve a variety of performance improvements I’ve been meaning to get to implementing — on the parts of the system I implemented in the past (which, incidentally, don’t overlap with the parts of the system you’ve been testing…).

There’s certainly much more that can be said on the topic of efficiency improvements. Benchmark suites that reflect real-world use cases (along with automated testing) can certainly help guide development. For the moment, though, to ensure this reply gets posted, I’ll stop here.

Thanks again!

Jeff

Revision 11226 emulator:
[attachment=12475]

Revision 14730 emulator:
[attachment=12476]


RE: Can HP Prime be faster? - firmware performance comparison - komame - 08-30-2023 09:47 AM

Jeff,

Thank you for showing interest in this topic. It makes me feel that the time I dedicated to it is not in vain and provides some hope that someone will address this issue.

Quote:The change in the framebuffer format should be visible in the emulator. Today I tried (a portion of) your PPL program on revisions 14730 and 11226 of the emulator (on Windows, in both cases). I’m attaching two screenshots; upon close inspection, banding should be seen on the screen capture from the revision 11226 emulator.

Reading your answer, I realized that my description was not entirely clear, in fact it was even wrong. When conducting my tests, I did not use an emulator and everything was performed on a real HP Prime device. I didn't install emulators with older firmware versions, as it wasn't necessary for my purposes. However, the screenshots I took for documenting results came from the "Monitor" tool available within the HP Connectivity Kit, where I could view the device's screen on my PC via USB connection. In this regard, regardless of the firmware version (12066/13333), the image displayed was the same. However, I must clarify that this is not an emulator but rather the HP Connectivity Kit's "Monitor" feature. I sincerely apologize for my imprecise words.

Quote:The framebuffer format change doesn’t substantially increase rendering code complexity (it may even, arguably, simplify some parts), but it does result in more memory being moved around by the parts of the project involved with graphics. (Including, for example, refreshing.)

All that said, your numbers show efficiency changing both before and after the change in framebuffer formats.

You're correct, transitioning from a 16-bit mode to a 32-bit mode should indeed simplify the code. This stems from the fact that the 16-bit mode (e.g., 565 mode) requires a number of bit-level operations (shifts, bitwise AND/OR) to enable a single pixel, whereas the 32-bit mode allows manipulation of whole bytes with straightforward assignment operations (simplistically speaking), without bit-level manipulation (I've written a set of graphics commands directly in ARM assembly code that draw lines, ellipses, bitmaps in 565 mode, so I'm familiar with this).
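For readers unfamiliar with the 565 format, the bit-level work can be sketched in a few lines of Python. This is a generic illustration of RGB565 packing, not the HP Prime's actual rendering code, and the 32-bit layout shown (one byte per channel, red in the highest used byte) is an assumption, since the thread does not specify it.

```python
def pack_rgb565(r, g, b):
    """Pack 8-bit R, G, B into one 16-bit 565 value using shifts and ORs."""
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)

def pack_xrgb8888(r, g, b):
    """Place each 8-bit channel in its own byte of a 32-bit value."""
    return (r << 16) | (g << 8) | b

# Count how many distinct blue shades survive in each format.
blue_levels_565 = len({pack_rgb565(0, 0, b) for b in range(256)})
blue_levels_8888 = len({pack_xrgb8888(0, 0, b) for b in range(256)})
print(blue_levels_565, blue_levels_8888)  # 32 and 256
```

With only 5 bits for blue, a 565 framebuffer can show 32 blue levels instead of 256, which is consistent with the banding seen on the older firmware in the screen-filling test.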

Returning to the topic of performance, after further analysis, I now believe that this change positively impacted performance. Let's reconsider the excerpt of the graph where the 32-bit mode was introduced.
[attachment=12483]
All three data series experienced a sudden drop (a global influence affected all three processes). However, it's evident that the yellow series dropped half as much as the other two series. While iterating an empty loop and simple numeric calculations dropped by about 13% in percentage terms, pixel activation dropped only by 7%. Had this global factor not contributed to the drop, we might have observed a 6-7% increase in pixel activation performance.
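The 6-7% estimate can be reproduced with simple arithmetic. The sketch below takes the rounded figures from the paragraph above (13% and 7%) and removes the global drop from the pixel series in two ways, by subtraction and by division. Treating the 13% drop as a purely global factor is the assumption made in the post.

```python
global_change = -0.13  # drop seen in the empty-loop and calculation series
pixel_change = -0.07   # drop seen in the pixel-activation series

# Additive view: subtract the global drop from the observed drop.
additive = pixel_change - global_change

# Multiplicative view: divide out the global factor.
multiplicative = (1 + pixel_change) / (1 + global_change) - 1

print(f"additive: {additive:+.1%}, multiplicative: {multiplicative:+.1%}")
```

The two views give roughly +6% and +7%, which matches the range quoted above.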

I'm genuinely pleased to hear about plans for performance improvements, and I'm hopeful that with the next, or perhaps subsequent, firmware versions, we'll notice some positive changes in this regard.

Thanks!


RE: Can HP Prime be faster? - firmware performance comparison - Insoft - 09-06-2023 06:51 PM

If you're looking for faster performance than PPL offers, look at Python.
Python is way faster than PPL at graphical stuff like setting pixels, drawing lines, rectangles, etc.

PPL is a snail compared with Python, and with Python code embedded in PPL using
Code:
#PYTHON name
#end
EXPORT START()
BEGIN
PYTHON(name);
END;
one can have the best of both worlds, though I recently ran into issues with large lists in Python, so PPL still has its uses.


RE: Can HP Prime be faster? - firmware performance comparison - komame - 09-07-2023 07:59 AM

(09-06-2023 06:51 PM)Insoft Wrote:  If you're looking for faster performance than PPL offers, look at Python.
Python is way faster than PPL at graphical stuff like setting pixels, drawing lines, rectangles, etc.

My goal was not to measure operations in graphics, but rather the overall performance of the HP Prime environment, which varied across different firmware versions. The main point of this thread is that in the past, HP Prime (with older firmware) was significantly faster, at least in the area of PPL operations. If you were to write a complex algorithm in PPL that calculates something on a list variable, and with the latest firmware, it takes 5 minutes to compute the result, running it on older firmware would only take 3 minutes (which I confirmed by flashing my HP Prime to an older version). My considerations aimed to reveal this fact and perhaps prompt the new firmware versions to restore the performance that was present in the older versions.

Returning to Python, there is also a notation where you don't need to use the PYTHON keyword in the PPL part.
Code:
#PYTHON name()
print('Python test')
#end

EXPORT START()
BEGIN
name();
END;

In the Python area, performance differences also exist for different firmware versions on the HP Prime, but they are significantly smaller. Python also ensures much greater measurement repeatability; for instance, a hard reset is not necessary because individual runs consistently yield similar results, often with precision down to single milliseconds. In the case of PPL programs, running the same program twice can yield varying duration measurements by even a few percent, indicating a certain level of instability in this language in that regard.

For example, let's examine this Python program (just an empty loop of 5 million iterations):
Code:
#PYTHON python_test()
for i in range(0, 5000000):
  pass
#end

EXPORT START()
BEGIN
local t;
t:=TICKS;
python_test();
MSGBOX(TICKS-t);
END;

If you run it on G1 firmware version 14730, it will execute in 12055 ms, and on firmware version 14588, it will execute in 11699 ms (approximately 350 ms faster).
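Those two timings can be broken down a little further. The sketch below uses only the figures quoted above and derives the absolute difference, the relative difference, and the approximate cost of one empty loop iteration on each firmware.

```python
ITERATIONS = 5_000_000
timings_ms = {"14730": 12055, "14588": 11699}  # values quoted in the post

diff_ms = timings_ms["14730"] - timings_ms["14588"]
percent_less_time = diff_ms / timings_ms["14730"] * 100.0

# Average cost of a single empty iteration, in microseconds.
us_per_iteration = {fw: ms * 1000.0 / ITERATIONS for fw, ms in timings_ms.items()}

print(f"difference: {diff_ms} ms ({percent_less_time:.1f}% less time on 14588)")
for fw, us in us_per_iteration.items():
    print(f"firmware {fw}: {us:.3f} us per iteration")
```

The difference works out to about 3%, noticeably smaller than the swings seen in the PPL tests.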

Nevertheless, Python is still an excellent candidate to succeed PPL, and if a stable runtime environment can be achieved, it will undoubtedly become the fastest Python on a calculator in the world, greatly enhancing its potential and providing much greater capabilities. Perhaps in this situation, it's not worth wasting time (and money) trying to improve PPL's performance; it might be better to focus on enhancing Python instead.


RE: Can HP Prime be faster? - firmware performance comparison - Tyann - 09-07-2023 06:39 PM

Hello,

If you don't mind a little criticism: you're not the only HP Prime user, and others (myself included) may be less interested in Python and more interested in HP PPL.


RE: Can HP Prime be faster? - firmware performance comparison - komame - 09-07-2023 07:31 PM

(09-07-2023 06:39 PM)Tyann Wrote:  If you don't mind a little criticism, you're not the only user of HP prime
and others (myself included) may have less interest in Python and more for HP PPL.

I'm sorry if I was misunderstood. It was simply a response to the statement made by my predecessor, who mentioned the possibility of using Python. Personally, I also prefer PPL, which is why I spent so much time conducting this series of research and creating the charts you can see in the first post of this thread. As you can see, everything was done based on PPL, and I would very much like the values on these charts to increase, opening up new possibilities for development in PPL.
On the other hand, I'm a realist. HP has already invested in Python, for which thousands of ready-made libraries and solutions are easy to find, whereas in PPL you would have to write them yourself, so they probably didn't do it with the intention of further developing PPL. Additionally, PPL has many limitations and issues that would require significant effort to address, such as the limits on list and matrix (table) length, not to mention performance: Python can perform calculations nearly ten times faster than PPL, which enables much more advanced algorithms.
I do, however, have a faint hope that something good will still happen with PPL ;)