Emulator vs simulator performance
HP Forums (https://www.hpmuseum.org/forum), General Forum, thread /thread-15140.html
Emulator vs simulator performance - J-F Garnier - 06-05-2020 03:03 PM

The question is what order of magnitude of performance penalty an emulator incurs versus a native application.

(06-05-2020 12:39 AM)Valentin Albillo Wrote:
Quote: An emulator adds the overhead of the CPU/hardware simulation; Free42 is a native application.
All depends on the performance of the emulation engine, I can't really judge the x60 ratio but it's the order of magnitude we can expect

Ok, so let's compare Free42 and Emu42, which have similar functionality and are both high-quality software recognized by the community, and let's compare them on the same host system. You may say it's not fair, since Free42 provides much better accuracy with its 35-digit arithmetic, so let's also include the binary version that has similar arithmetic accuracy.

I will not use a trivial benchmark but the program from Valentin's latest article, which computes the area of the Mandelbrot set (great article, Valentin, I may comment on it later in another thread). The program is run to evaluate 10,000 points with the other default parameters. The display is fully static during Valentin's program run (no display updates, no flying goose). Emu42 and Free42 are run separately; during the run they each use about one of the cores of my core-i3 machine (global CPU loading ~25-30%). Of course, Emu42 is run with authentic calculator speed off.

Emu42 1.24 : 5min08s
Free42 2.5.18 decimal : 6.3s
Free42 2.5.18 binary : 2.0s

Emu42 execution time was measured by hand, Free42 execution time was measured using TIME.

J-F

RE: Emulator vs simulator performance - David Hayden - 06-05-2020 03:26 PM

Interesting. So the 60x slowdown claim appears fairly accurate. Decimal Free42 is about 49 times faster than Emu42.

Eric Smith can probably speak more knowledgeably than I can, but here are some thoughts. The Saturn CPU is pretty different from an x86. Emulating it might be harder than a more traditional one.
Things like register fields might be hard. And if the Saturn BCD arithmetic doesn't translate well to x86 then that could be a problem.

Also, it's my understanding that the 42S is implemented in SysRPL. That adds another layer of complication and slowdown compared to assembly language. As I recall from working with C on a 50g, native C code is about 5x faster than Saturn assembly, which is about 5x faster than SysRPL, which is about 5x faster than User RPL. Those values may vary somewhat, but I do recall specifically that C code is about 100x faster than User RPL on the 50g.

Dave

RE: Emulator vs simulator performance - J-F Garnier - 06-05-2020 03:45 PM

(06-05-2020 03:26 PM)David Hayden Wrote: Also, it's my understanding that the 42s is implemented in SysRPL. That adds another layer of complication and slowdown compared to assembly language. As I recall working with C on a 50g, the speed difference there is something like native C code is about 5x faster than Saturn assembly, which is about 5x faster than SysRPL, which is about 5x faster than User RPL. Those values may vary somewhat, but I do recall specifically that C code is about 100x faster than userRPL on the 50g.

Yes, the 42S is internally using RPL. But Free42 is written in C, not assembly language, so the comparison is fair. On the other hand, in this particular benchmark, we can expect that a significant part of the time is spent in arithmetic operations, which are in asm in the HP-42S; I'm not sure about the Intel 35-digit library.

My own Emu71/DOS is largely written in assembly (using Intel x86 BCD support for efficiency, at the cost of minor compatibility issues) but it's a 16-bit DOS application and there is no native HP-71 simulator available for comparison.

J-F

RE: Emulator vs simulator performance - Valentin Albillo - 06-05-2020 04:24 PM
Hi, J-F:

(06-05-2020 03:03 PM)J-F Garnier Wrote: I will not use a trivial benchmark but the program from the latest article from Valentin, that computes the area of the Mandelbrot set (great article, Valentin, I may comment it later in an other thread).

Thank you very much, you're most welcome to comment to your heart's content.

Quote: The program is run to evaluate 10,000 points with the other default parameters. The display is fully static during Valentin's program run (no display updates, no flying goose).

(How do you know it's a "goose" and not a "gander"? Oh, I remember, the gander flies backwards. At least it does in the HP-41C family, as the HP42S (and Free42) uses a little triangle instead.)

Quote: Emu42 and Free42 are run separately, during the run they each use about one of the core of my core-i3 machine (global CPU loading ~25-30%).

Quite a difference, indeed. Emu42 is calculating ~32 points/sec, Free42 BCD is doing ~1,587 points/sec and Free42 binary does 5,000 points/sec.

The difference between 32 p/s and 5,000 p/s is ~156x, which seems to me an unreasonable performance difference for an emulation (Emu42) vs. a simulation (Free42); that's more than two orders of magnitude faster (or slower) on the same hardware (and presumably OS).

Also, your Free42 BCD is running only ~75% faster on your hardware than mine on my mid-range Samsung tablet. Frankly, I would expect it to run much faster on your system (say, at least 4x).

Puzzling results to me, all of them.

Thanks and regards.
V.

RE: Emulator vs simulator performance - Werner - 06-05-2020 04:56 PM

(06-05-2020 03:45 PM)J-F Garnier Wrote: Yes, the 42S is internally using RPL. But Free42 is written in C, not assembly language, so the comparison is fair.

... but Free42 is compiled, and the 42 SysRPL is interpreted.

Werner

RE: Emulator vs simulator performance - Christoph Giesselink - 06-05-2020 07:27 PM

I can only speak about Emu42, I don't know any details about the inside of Free42.
From early times I put the focus of my emulator development on rebuilding the original hardware at the functional level as closely as possible. I also expect an emulation of a real machine to be faster, or at minimum to have equal speed.

The Saturn CPU emulation inside Emu42 was originally built by Sebastien Carlier for Emu48 and was improved by me during the Emu48 development over the years. This emulation core was never optimized for speed.

One anecdote concerns the Saturn opcode dispatcher. The current emulator code decodes the Saturn opcode through tables, one nibble at a time. This keeps the tables small because 1 nibble = 4 bit -> 2^4 = 16 conditions in a table. Cyrille thought, why not decode 3 nibbles (12 bit) of the opcode in the first step. So the first opcode dispatcher table had a size of 2^12 = 4096 entries. On a PC the emulator with the 4096-entry dispatcher table was about 10% faster than the one with 16 entries. Mission accomplished? Not really, the optimization was done for the Pocket PCs with Windows CE or Pocket PC 2002 devices. On these devices the new version was massively slower. And why? The 4096-entry dispatcher table didn't fit into the 1st level CPU cache and so all accesses to this table had to go through the slow main memory.

So what is important for me? The speed difference between original calculator and emulation. So I think a 170 to 340 times faster emulation is fast enough. This is a benchmark list from 2011 comparing the emulation on different host systems to the real machine:

Code:
Making repeatable benchmarks on the HP42S is not as easy as it sounds. First of all, the FOCAL code can only search for global labels before its current position, so if the search reaches the top of memory and the label has not been found so far, the search continues at the .END. position. So if you have many programs with many global labels behind your program, this will slow down program execution.

One more detail: there was a bug in the RAW file object loader until Emu42 v1.22. The FOCAL program object loader also allows importing HP41 FOCAL programs saved by the V41 emulator. Because of some internal differences in NULL byte handling, NULL bytes are removed (packing) or added (behind numbers) and so the distance between labels changes. The distance on global labels was fixed during the import, the distance on local labels was not. This caused execution errors when using HP41 programs. Therefore the import now clears the distance information in all local label jump and execute FOCAL opcodes. Don't worry, the HP42 restores these offsets at the first program run, and because of this the first run of a FOCAL program directly after importing is slower than the following runs.

But now to a further difference between emulation and simulation. The Emu42 emulator has to handle some speed-related issues when running the code of an original ROM. Just remember, the authors of the code didn't think about running this code 100-400 times faster. So some parts are just done by executing code in a loop to create a delay or the frequency of a beeper. On the HP48 the backarrow key has an autorepeat function. So pressing the backarrow key and holding it slowly removes character by character in the command line. When you have a machine which runs 100 times faster and you do the same thing, the input line is immediately empty. This is what happened with Emu48 running a HP48.

But back to Emu42. It took me years to implement the Redeye sending and the beeper emulation.
In both cases the timing is done by the CPU executing opcodes. Moreover, the CPU strobe frequency is not very accurate, so it is not usable for sending the Redeye Printer protocol. So the ROM code makes a speed calibration of the CPU before printing. For this, a loop with known CPU cycles is executed in a time frame given by a timer referenced by the 32768Hz crystal. The number of loops is counted. The bad thing is, the register width of the loop counter is too small for a 150 times faster CPU execution, so the count register overran many, many times and the result of the speed measurement was rubbish, affecting the emulation of the Redeye frame transmitter and beep generation. I think this is an important difference between emulation and simulation.

At the last Allschwil meeting I talked about a speed update for my HP-92198 simulation. I did a print of a large HP71 BASIC program with Emu71 which took 180s. I don't know how fast the original hardware is, HP71 and HP-92198 video output, but it's slower. Now I cheated, the current HP-92198 simulation does the same thing in 8s. Why do I say cheated? The program is compiled with the same C++ compiler running on the same machine. The difference is the display update. The prior version updated the display content after every new character, the current version only every 30ms. So I modified the test conditions; the result for the user is the same, but it's a huge difference for the CPU.

So when we speak about calculations we are discussing numerical results of a mathematical problem. When I change the numerical algorithm for some reason, it's the same cheating as with the display output: I changed the test conditions. So I think it's quite hard to compare Emu42 with Free42. Use the one which is more suitable for your problem.

BTW how fast is Free42 compared to Wolfram Mathematica solving the same problem?
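The loop-counter overflow described in the post above can be illustrated with a small Python sketch. Only the 32768 Hz crystal comes from the post; the window length, the cycles per loop and the 12-bit counter width are hypothetical values chosen for illustration, not the real ROM parameters.

```python
CRYSTAL_HZ = 32768  # timer reference crystal, as mentioned in the post


def calibrate(cpu_hz, window_ticks, cycles_per_loop, counter_bits):
    """Return (true_loops, counted_loops) for one calibration window.

    window_ticks, cycles_per_loop and counter_bits are hypothetical;
    the real ROM values are not given in the thread.
    """
    # CPU cycles that elapse while the timer counts window_ticks crystal ticks
    cycles_in_window = cpu_hz * window_ticks // CRYSTAL_HZ
    true_loops = cycles_in_window // cycles_per_loop
    # A fixed-width count register silently wraps around on overflow
    counted_loops = true_loops % (1 << counter_bits)
    return true_loops, counted_loops


if __name__ == "__main__":
    for speedup in (1, 150):
        true_loops, counted = calibrate(1_000_000 * speedup, 512, 20, 12)
        print(f"{speedup:>3}x: true loops = {true_loops}, counted = {counted}")
```

With these assumed numbers the count fits in the register at original speed, but wraps many times at 150x, so the ROM would derive a completely wrong CPU speed from it.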
RE: Emulator vs simulator performance - Thomas Okken - 06-06-2020 04:37 AM

(06-05-2020 03:45 PM)J-F Garnier Wrote: But Free42 is written in C, not assembly language, so the comparison is fair.

The Intel Decimal Floating-Point Math Library is written in C, and works using 64-bit integer operations under the hood. I don't think there's any assembly language in there but I haven't really looked for it... but when building for ARM, there definitely isn't, since that is not a supported target at all, and I had to do a bit of hacking to even get it to build for Android and iOS.

When built for 32-bit targets, the 64-bit integer operations are compiled into multiple 32-bit operations, and that might benefit from coding it in assembly instead to eliminate duplicate operations (e.g. a 64-bit integer multiplication requires four 32-bit multiplications in general, but the special case of squaring a 64-bit number can be done using only three 32-bit multiplications), but this was not done. Free42 for Windows is a 32-bit app, and the Android and iOS versions contain 32-bit and 64-bit code, running the 32-bit code on 32-bit platforms and running 64-bit code otherwise.

RE: Emulator vs simulator performance - J-F Garnier - 06-08-2020 07:53 AM

(06-05-2020 07:27 PM)Christoph Giesselink Wrote: So what is important for me? The speed difference between original calculator and emulation. So I think a 170 to 340 times faster emulation is fast enough. This is a benchmark list from 2011 comparing the emulation on different host systems comparing to the real machine:

I agree with you, the improved speed vs the original is an important benefit, but not the only goal of an emulator. Thanks for the benchmark, I remember it now.

Quote: So I think it's quite hard to compare Emu42 with Free42. Use the one which is more suitable for your problem.

Well, not so hard to compare, since I did it :-) but yes I understand what you mean.
My goal was just to have a clearer view of the performance difference. We can't expect an emulator to compete with native applications. I'm using both Emu42 and Free42, for different usages, just as I'm using your Emu71 for Windows in some cases (very precise emulation) and my Emu71/DOS for others (integrated HP-IL environment).

J-F

RE: Emulator vs simulator performance - J-F Garnier - 06-09-2020 07:41 AM

(06-06-2020 04:37 AM)Thomas Okken Wrote: When built for 32-bit targets, the 64-bit integer operations are compiled into multiple 32-bit operations, and that might benefit from coding it in assembly instead to eliminate duplicate operations (e.g. a 64-bit integer multiplication requires four 32-bit multiplications in general, but the special case of squaring a 64-bit number can be done using only three 32-bit multiplications), but this was not done. Free42 for Windows is a 32-bit app, and the Android and iOS versions contain 32-bit and 64-bit code, running the 32-bit code on 32-bit platforms and running 64-bit code otherwise.

This may explain the observation from Valentin:

(06-05-2020 04:24 PM)Valentin Albillo Wrote: Also, your Free42 BCD is running only ~75% faster on your hardware than mine on my mid-range Samsung tablet. Frankly, I would expect it to run much faster in your system (say, at least 4x).

Also I'm using a modest 1.8GHz core-i3 machine; more powerful machines may give significantly better performance.

J-F

RE: Emulator vs simulator performance - Jonathan Busby - 06-09-2020 08:49 PM

(06-05-2020 03:26 PM)David Hayden Wrote: [snip]

I would put it as *extremely different*. The only real similarity is that the x86 and Saturn family of processors both use a "dest = dest op source" machine code format for many of their arithmetical-logical operations, although the Saturn has many "dest = source op dest" machine instructions as well.

Quote: Emulating it might be harder than a more traditional one. Things like register fields might be hard.
If you're writing the emulator in a language like C, then dealing with register fields means dealing with a lot of bit-masking and bit-shifts ( except maybe for the Emu48 family of emulators, which seem to perform decimal operations ( and apparently HEX mode operations as well ) in a nibble-serial manner, if I'm not mistaken. This might explain why some of the emulators are so slow with respect to emulating Saturn machine code arithmetic and comparison instructions ).

Quote: And if the Saturn BCD arithmetic doesn't translate well to x86 then that could be a problem.

The BCD instructions on modern x86 CPUs are legacy instructions that have been relegated to the CPU's microcode ROM, and are therefore very slow. The 32-bit x86 CPUs only have instructions that operate on two packed BCD digits at a time. For example :

Code:
MOV AL, 42h

which adds packed BCD 42 to 33, or :

Code:
MOV AL, 42h

which subtracts packed BCD 33 from 42. The "DAA" and "DAS" instructions stand for "Decimal Adjust for Addition" and "Decimal Adjust for Subtraction" respectively. In AMD64 / x64 "long mode", the above DAA and DAS instructions are not available, having been used as part of the 64-bit instruction set encoding.

In C, especially on a 64-bit x64 machine, it's faster to use a little bit-twiddling and arithmetic trickery to perform packed BCD arithmetic. ( A quick google search turns up many solutions for addition and subtraction of packed BCD integers in C using only bitwise operations and arithmetic, eg. see this Wikipedia solution in C which uses just ten bitwise and arithmetic operations ( no multiplication or division ). Also, I do have a vague memory of a bitwise solution that only used *six* operations, but I can't remember where or when I saw it and I can't seem to reproduce it "de novo" either :/ .
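As a hedged illustration of the kind of bitwise packed-BCD addition referred to here, the following Python sketch follows the widely published "add 6 to every digit, then take the unused 6s back out" technique that the ten-operation C routines are based on. It is not necessarily the exact routine the post had in mind; the function names are illustrative, and the 32-bit mask only mimics a C `uint32_t`.

```python
MASK32 = 0xFFFFFFFF  # emulate 32-bit unsigned arithmetic


def bcd_add(a, b):
    """Add two packed-BCD values of up to 7 digits each (one digit per nibble)."""
    t1 = (a + 0x06666666) & MASK32  # pre-add 6 to every digit so decimal carries become binary carries
    t2 = t1 ^ b                     # per-bit sum without carry propagation
    t1 = (t1 + b) & MASK32          # provisional binary sum
    t2 = t1 ^ t2                    # bits that received a carry-in
    t2 = ~t2 & 0x11111110           # digit boundaries where NO carry crossed
    t2 = (t2 >> 2) | (t2 >> 3)      # a 6 for every digit that did not carry out
    return t1 - t2                  # remove the 6s that were not needed


def to_bcd(n):
    """Encode a non-negative integer as packed BCD (decimal digits read as hex digits)."""
    return int(str(n), 16)


def from_bcd(x):
    """Decode a packed-BCD value back to an ordinary integer."""
    return int(format(x, "x"))


if __name__ == "__main__":
    print(hex(bcd_add(0x42, 0x33)))  # the 42 + 33 example from the post
```

No multiplication, division or per-digit loop is needed, which is why this style of routine tends to beat both nibble-serial code and the legacy DAA/DAS instructions.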
On many ARM processors, even the above ten operations can be reduced to just *eight* operations because of many ARM processors' "0-cycle" barrel-shifter, which one gets for free with most ARM arithmetic or bitwise instructions. )

( EDIT #1 : The above statements about ARM-based processors' "free" or "0-cycle" barrel-shifter may not be entirely accurate, as I haven't done much ARM assembly language in a long while. I believe that the previously "free" shift operation may incur some performance penalties on more modern ARM processors, although I'm no expert on ARM assembly, so I'm not sure. )

Quote: Also, it's my understanding that the 42s is implemented in SysRPL. That adds another layer of complication and slowdown compared to assembly language.

Indeed, RPL on Saturn CPUs involves at least *three* levels of indirection a lot of the time.

Quote: As I recall working with C on a 50g, the speed difference there is something like native C code is about 5x faster than Saturn assembly, which is about 5x faster than SysRPL, which is about 5x faster than User RPL. Those values may vary somewhat, but I do recall specifically that C code is about 100x faster than userRPL on the 50g.

That sounds about right.

Regards,
Jonathan

P.S. The formatting seems to be messed up in the preview of this post, with long lines running off-screen, which generates a scroll bar.

RE: Emulator vs simulator performance - J-F Garnier - 06-10-2020 07:57 AM

(06-05-2020 04:56 PM)Werner Wrote:
(06-05-2020 03:45 PM)J-F Garnier Wrote: Yes, the 42S is internally using RPL. But Free42 is written in C, not assembly language, so the comparison is fair.
... but Free42 is compiled, and the 42 SysRPL is interpreted.

SysRPL is not exactly interpreted. An interpreter such as HP Basic uses tokens and relies on tables to get the execution address (on HP Basic, it's quite complex and relatively slow with all the possible LEXs to scan). In SysRPL, the "tokens" are the execution addresses themselves.
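The distinction between a token interpreter and address-based execution can be sketched in Python as follows. This is only a conceptual model under stated assumptions: the word names, the token numbers and the tiny stack machine are invented for illustration and do not reflect the actual HP Basic or SysRPL internals.

```python
# Primitive "words" of a tiny hypothetical stack machine
def two(stack):
    stack.append(2)


def three(stack):
    stack.append(3)


def add(stack):
    b, a = stack.pop(), stack.pop()
    stack.append(a + b)


def mul(stack):
    b, a = stack.pop(), stack.pop()
    stack.append(a * b)


# Token interpreter: the program stores small token numbers,
# and every step needs a table lookup to find the routine.
TOKEN_TABLE = {0: two, 1: three, 2: add, 3: mul}


def run_tokenized(tokens):
    stack = []
    for tok in tokens:
        handler = TOKEN_TABLE[tok]  # extra lookup on every step
        handler(stack)
    return stack


# Address-based execution: the program stores the routines themselves
# (the Python analogue of execution addresses), so no lookup is needed.
def run_threaded(words):
    stack = []
    for word in words:
        word(stack)
    return stack


if __name__ == "__main__":
    print(run_tokenized([0, 1, 2, 1, 3]))              # (2 + 3) * 3
    print(run_threaded([two, three, add, three, mul]))  # same program, no table
```

Both runs compute the same result; the second simply skips the per-step table search that the post describes as the slow part of a token-based interpreter.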
The right term is probably "threaded code", as for the Forth language. But I'm not an RPL expert :-)

(06-09-2020 08:49 PM)Jonathan Busby Wrote:
Quote: Also, it's my understanding that the 42s is implemented in SysRPL. That adds another layer of complication and slowdown compared to assembly language.
Indeed RPL on Saturn CPU's involves at least *three* levels of indirection a lot of the time

What is the penalty of SysRPL compared to assembly language? I tried to answer by porting Valentin's program to the HP-32SII, emulated in Christoph's Emu42. So the comparison is done at constant CPU speed. Here is the result, for 10,000 points as in the above test:

HP42S emulated on Emu42 1.24 : 5min08s (as above)
HP32SII emulated on Emu42 1.24 : 2min01s !

So although the 32SII RPN language is not as powerful for complex numbers as that of the 42S, the 32SII is 2.5x more efficient than the HP42S. I didn't expect so much difference, it's a surprise for me.

The difference is not as large on the real machines, since the CPU speed is probably reduced on the 32S. The 42S is supposed to run at 1MHz, does somebody know the speed of the 32SII CPU?

For the curious, here is the HP-32S program:

Code:
; Mandelbrot area, for the HP-32S/SII

J-F

RE: Emulator vs simulator performance - Didier Lachieze - 06-10-2020 12:16 PM

(06-10-2020 07:57 AM)J-F Garnier Wrote: The 42S is supposed to run at 1MHz, does somebody know the speed of the 32SII CPU?

According to Wikipedia the Lewis CPU used in the 42S runs at 1MHz and the Sacajawea CPU used in the 32SII runs at 640kHz.

(Edited to fix typo on the 32SII frequency)

RE: Emulator vs simulator performance - Christoph Giesselink - 06-10-2020 01:53 PM

(06-10-2020 12:16 PM)Didier Lachieze Wrote:
(06-10-2020 07:57 AM)J-F Garnier Wrote: The 42S is supposed to run at 1MHz, does somebody know the speed of the 32SII CPU?

These speed settings at Wikipedia were contributed by me many years ago. But I'm not sure about the values for the 1LU7 Bert and 1LR3 Sacajawea chips any more.
First, the PCBs for the Low-End Pioneers contain no crystal. The external component on a 1LU7 PCB is a capacitor, and the external components on a 1LR3 PCB are a capacitor and a resistor.

But how did I get the "Authentic Speed" value for Emu42 emulating a HP32SII? In this case I went a more practical way. I programmed Katie Wasserman's "99 Digits of PI on an HP 32SII" on a real HP32SII and measured the execution time. It took around 11 minutes, as mentioned in the article. With this measured time I adjusted the Sacajawea CPU cycles reference inside Emu42 to a value of 54, so that the real and emulated calculators need more or less the same time to execute this program. What does this number 54 mean? Execute 54 CPU cycles in a 16384 Hz time frame, so 54 * 16384 Hz = 884736 Hz. BTW, the CPU cycles reference in Emu42 for a 1MHz Lewis CPU is 61 (61 * 16384 Hz = 999424 Hz).

Can I prove this calculated frequency? This depends on the memory type the assembler code is running from. On all memory devices which are connected directly to the Saturn bus, the CPU cycles inside Emu42 are correct. But when the memory device is accessed through a Saturn bus to 8 bit converter, in order to access regular static RAM or ROM devices with an 8 bit data bus interface, the CPU cycles inside Emu42 are wrong. In the latter case you have the problem that the same opcode needs more cycles for a memory access and, even more, the same opcode may need a different number of cycles depending on whether it is executed at an even or an odd address.

But what does this mean in the case of the HP32SII? Both RAM and ROM are internal devices directly connected to the Saturn bus. So the CPU cycles used should be correct.

As a result I now assume a CPU strobe frequency of about 884 kHz for the HP32SII.
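The arithmetic behind the "CPU cycles reference" in the post above can be written out as a short Python sketch. The 16384 Hz time frame and the values 54 and 61 come from the post; the function names are illustrative and are not taken from the Emu42 sources.

```python
TIMEFRAME_HZ = 16384  # time frames per second, as stated in the post


def strobe_hz(cycles_ref):
    """Effective CPU strobe frequency for a given cycles-per-time-frame reference."""
    return cycles_ref * TIMEFRAME_HZ


def nearest_cycles_ref(target_hz):
    """Integer cycles reference that comes closest to a target CPU frequency."""
    return round(target_hz / TIMEFRAME_HZ)


if __name__ == "__main__":
    print(strobe_hz(54))                   # Sacajawea (HP32SII) setting
    print(strobe_hz(61))                   # Lewis (HP42S) setting
    print(nearest_cycles_ref(1_000_000))   # reference closest to 1 MHz
```

Because the reference is an integer number of cycles per time frame, the emulated frequency can only be set in steps of 16384 Hz, which is why the 1MHz Lewis ends up at 999424 Hz.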
RE: Emulator vs simulator performance - Jonathan Busby - 06-10-2020 08:32 PM

(06-10-2020 07:57 AM)J-F Garnier Wrote: [snip]

You're actually completely correct. See this article. Technically RPL is a "TIL" or "Threaded Interpreted Language".

(06-09-2020 08:49 PM)Jonathan Busby Wrote: Indeed RPL on Saturn CPU's involves at least *three* levels of indirection a lot of the time

Actually I misspoke : Although there are usually *three* levels of indirection with most simple RPL objects that eg. just push themselves to the stack, the third level of indirection is just the act of re-entering the RPL inner loop.

Quote: What is the penalty of SysRPL compared to assembly language?

Well, the "RPL inner loop", as implemented on Saturn based HP calculators, follows the control flow below, assuming the current object being executed in the runstream is a pointer to an embedded BINT object ( to demonstrate the various levels of indirection ) :

Code:
A=DAT0 A

Code:
=PRLG LC(2) 10

Code:
dirbint D0=D0- 5

So, I initially misspoke, as there are technically only *two* levels of indirection in most Sys-RPL words.

As for the Sys-RPL performance penalty, well, direct object execution involves *two* "PC=(A)" instructions. This means that a 5-nibble address has to be read from memory, and memory accesses on the Saturn CPU are notoriously slow. Also, it involves a 5-nibble absolute control flow jump as the PC is set to the address previously read from memory, and such absolute jumps on the Saturn CPU are slow, although not as slow as memory accesses. Also, one must take into account all the other instructions that are executed when an RPL object is directly executed in the runstream, and this also adds a lot of overhead.

( EDIT : The reason for memory accesses being so slow on the Saturn CPU is not due to the Saturn CPU itself per se, but instead due to the Saturn Bus. In the original discrete HP71B Saturn chip, I believe that the Saturn Bus Interface was integrated onto the chip.
On later Saturn based SoCs like the Yorke, the Saturn bus ran at half the speed of the Saturn CPU itself, which slowed down memory accesses by about 2x. This, though, is only one aspect of the Saturn bus which contributes to the slowdown.

For an instruction like "PC=A", the Saturn CPU drives a "LOAD PC" command on the Saturn bus, which is then followed by a 5-cycle operation in which the CPU transfers the 5 nibbles of the new PC address, which the memory controllers load into their local PCs. There is then a command auto-switch to a "PC READ" command and a "dummy strobe" on the Saturn bus for memory pipelining.

For, e.g., an "A=DAT0 W" instruction, the CPU first issues a "LOAD DP" command onto the Saturn bus and then performs a 5-cycle operation in which it successively drives 5 address nibbles onto the Saturn bus, which are latched by the memory controllers. There is then a command auto-switch to "DP READ", another 1-cycle "dummy strobe", and then the CPU reads 16 nibbles from the Saturn bus.

So, for the "PC=(A)" instruction, you have 1 cycle for the "LOAD DP" command, 1 cycle for the dummy strobe, 5 cycles to read the data, another 1 cycle for the "LOAD PC" command, 5 cycles for the CPU to transfer the new PC address to the memory controllers and then, finally, a 1-cycle dummy strobe, for a total of 14 cycles, not including instruction decode time.

For the "A=DAT0 A" instruction, you have the initial 1-cycle "LOAD DP" bus command, 5 cycles to drive the 5 nibbles of the address, a 1-cycle dummy strobe and then 5 cycles for reading 5 data nibbles, for a total of 12 cycles, not counting instruction decode time. If this is on the Yorke SoC, then the total cycle length associated with the Saturn bus is around 24 cycles, as the Saturn bus on the Yorke SoC only runs at 2MHz. )

Quote: I tried to answer by porting Valentin's program to the HP-32SII, emulated in Christoph's Emu42. So the comparison is done at constant CPU speed.
AFAIK, the 32SII Saturn runs at about 640KHz.

Regards,
Jonathan

RE: Emulator vs simulator performance - Christoph Giesselink - 06-15-2020 08:54 PM

Performance measurements with Emu71/Win until v1.11

J-F Garnier made some performance tests with Emu71/DOS and Emu71/Win and detected a performance catastrophe inside Emu71/Win until v1.11 (the current version) in connection with the BASIC GOTO command. Emu71/DOS is not affected.

A loop with GOTO, where you would expect an execution time at full speed of ~0.08s, takes ~9s, whereas the same loop at "Authentic Calculator Speed" takes ~20s. This behavior is caused by annunciator settings combined with the display annunciator update, which takes quite long.

TNX to J-F Garnier for detecting this.

The problem will be fixed in a future version, so take care when you do performance measurements with Emu71/Win until v1.11 with a massive use of GOTO loops.

RE: Emulator vs simulator performance - Jake Schwartz - 06-16-2020 06:25 PM

(06-10-2020 12:16 PM)Didier Lachieze Wrote:
(06-10-2020 07:57 AM)J-F Garnier Wrote: The 42S is supposed to run at 1MHz, does somebody know the speed of the 32SII CPU?

This is stuff that former HPer Eric Vogel would have known right off the top of his head back in the day. He gave a nice presentation on the various Pioneer machines at a Chicago-area CHIP-group meeting back in April of 1991, around the time of the 32SII introduction, where he put chalk to blackboard and made a nice chart of all the ROM/RAM/CPU/display variants up and down the Pioneer series. Those were fun days.

Jake

RE: Emulator vs simulator performance - Jonathan Busby - 06-16-2020 07:12 PM

(06-10-2020 12:16 PM)Didier Lachieze Wrote:
(06-10-2020 07:57 AM)J-F Garnier Wrote: The 42S is supposed to run at 1MHz, does somebody know the speed of the 32SII CPU?
Not to nitpick, but I think you made a typo with "640Hz". It should be "640KHz". If it was just 640Hz, then it would be just a little faster than Charles Babbage's mechanical Analytical Engine.

Regards,
Jonathan

RE: Emulator vs simulator performance - Thomas Okken - 06-16-2020 07:23 PM

(06-16-2020 07:12 PM)Jonathan Busby Wrote: Not to nitpick, but I think you made a typo with "640Hz" It should be "640KHz" If it was just 640Hz, then it would be just a little faster than Charles' Babbage's mechanical Analytical Engine

Surely you mean kHz, as in 1000 Hz, and not KHz, as in Kelvin-Hertz?

RE: Emulator vs simulator performance - Jonathan Busby - 06-16-2020 07:26 PM

(06-16-2020 07:23 PM)Thomas Okken Wrote:
(06-16-2020 07:12 PM)Jonathan Busby Wrote: Not to nitpick, but I think you made a typo with "640Hz" It should be "640KHz" If it was just 640Hz, then it would be just a little faster than Charles' Babbage's mechanical Analytical Engine

HAHA! Seems I'm not the only one susceptible to making typos.

Regards,
Jonathan

RE: Emulator vs simulator performance - J-F Garnier - 06-18-2020 06:25 PM

So I ran Valentin's problem on Emu71 too, since the initial question was to compare a HP-71B emulator and Free42. Here are the results, still with 10,000 points and the other default parameters:

Emu71/Win 1.11 : 1min55s
Emu71/DOS 2.45 run in VirtualBox : 1min58s (not too bad for an old 16-bit DOS program :-)
(for reference, Free42 Decimal : 6.3s)

My first attempt was to adapt the 42S program with the same GOTOs, and that is how I identified the abnormal slow-down in Emu71/Win with GOTOs that Christoph reported above. Then I replaced the GOTOs with FOR..NEXT loops and other constructions (anyway, using GOTOs is considered bad programming style) and the Emu71/Win performance was then similar to Emu42 emulating a HP-32SII.

Here is my HP-71B program (without GOTOs):

Code:
10 !
------

The abnormal slowdown in Emu71/Win up to 1.11 occurs with GOTO/GOSUB and of course with statements directly or indirectly related to the annunciators: SFLAG/CFLAG/RADIANS/DEGREES/USER

J-F
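For readers who want to cross-check the figures quoted in this thread, here is a small Python sketch that recomputes the points-per-second rates and the speed ratios from the reported run times. The timings are the ones posted above, converted to seconds; the dictionary keys and function names are merely illustrative labels.

```python
POINTS = 10_000  # points evaluated in every run reported in the thread

# Run times in seconds, converted from the figures posted above
TIMINGS = {
    "Emu42 1.24 (HP-42S)": 5 * 60 + 8,
    "Emu42 1.24 (HP-32SII)": 2 * 60 + 1,
    "Emu71/Win 1.11": 1 * 60 + 55,
    "Emu71/DOS 2.45": 1 * 60 + 58,
    "Free42 2.5.18 decimal": 6.3,
    "Free42 2.5.18 binary": 2.0,
}


def points_per_second(name):
    """Throughput of one configuration in Mandelbrot points per second."""
    return POINTS / TIMINGS[name]


def ratio(slow, fast):
    """How many times longer `slow` takes than `fast`."""
    return TIMINGS[slow] / TIMINGS[fast]


if __name__ == "__main__":
    for name in TIMINGS:
        print(f"{name:24s} {points_per_second(name):8.1f} points/s")
    print(round(ratio("Emu42 1.24 (HP-42S)", "Free42 2.5.18 decimal")))
    print(round(ratio("Emu42 1.24 (HP-42S)", "Free42 2.5.18 binary")))
    print(round(ratio("Emu42 1.24 (HP-42S)", "Emu42 1.24 (HP-32SII)"), 1))
```

Computed directly from the run times, Emu42 versus Free42 binary comes out at about 154x; the ~156x quoted earlier in the thread results from using the rounded 32 points/sec figure.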