Tripartite Palindromic Partition of Integer (HP 50g) Challenge
|
05-25-2023, 11:18 PM
Post: #120
|
|||
|
|||
RE: Tripartite Palindromic Partition of Integer (HP 50g) Challenge
Apologies for the delay, but writing a wall of text this large takes a while...
This series of posts is meant to grant a look inside my SysRPL submission, with an emphasis on optimization. Fundamentally, there are two categories of optimization tricks here: general SysRPL ones, and application-specific ones. The latter category obviously doesn't teach much for other projects, but since there are alternative implementations for this particular problem, they may have a use anyway, beyond the high-level lesson to explore your topic thoroughly for shortcuts. I also expect that you have a copy of the proof at hand for the application-specific tricks, since they are kind of math-heavy and tie into the its arguments, use its definitions, reference its dispatching cases, and so on. Okay, let's dive into the source code, file by file. The first one is dedicated to a general SysRPL trick, which I lifted from Nosy: building a library with MASD directives instead of the separate CRLIB command. I have an unfinished utility library to assist with this, so I'll reserve the details for when I add the missing bits of input validation and customization so I can release it. But as the Nosy readme indicates, it lets you embed the code of one library command inside another's (with the condition that the embedded one can't be one of the named commands, due to library structure limitations), thus eliminating the ROMPTR you'd otherwise use in the spot you embed the code. A ROMPTR obviously uses up some memory, but the more important consideration is that it takes a substantial amount of time on execution, so you generally replace the call site with an embed where you expect the most "traffic", so to speak. A separate benefit from this trick is that you always compile all pieces of the project in one go. No more accidental mixing new versions of some commands with outdated versions of others, just because you forgot to recompile them before calling CRLIB. Not a classical optimization, but by preventing mistakes, I'll call it an optimization anyway: one for developer time. Since the tool to auto-generate and compile the code tying all the individual commands together isn't published just yet, I've bundled a copy of that generated code (lightly edited, because otherwise the library checksum needs to be adjusted after compiling) into the source. Filename: compile Code: !NO CODE Code: Sum3Pal: split a number into a sum of 3 palindromes in base-n > 4 Code: 859. Code: { Sum3Pal } Code: { ck1z ZINTtoDigits DigitsToZINT AlgSList subAsZINTthenAlgS AlgS5Callback pickType BINTmod AlgMain makeEvenLenPal makeOddLenPal AlgIIandIIIcarry AlgIVfinish2and4ii } Next up: the actual entry point from UserRPL (i.e. user programs and the command line) into this library. Since I merely chose the library structure for easy subroutines accessible from multiple other subroutines tucked away deeply in the call graph, the library has only this one visible command. Many SysRPL programs expected to be called from UserRPL use CKn&Dispatch (where n is a digit from 1 to 5) to simultaneously check count and type of their arguments, and additionally register the currently running ROMPTR as the command "blamed" in case of an error (you'll see the name in the error message window, in front of the word "Error:", when the message itself comes after). These are useful commands because they do several useful things for the cost of a single command and a handful of BINTs, particularly for "overloaded" commands like + which have multiple separate meanings depending on the types of their parameters (e.g. concatenate strings, concatenate lists, append objects to lists, add numbers). There are alternatives, though. A group of CAS entry points named ^CK1Z, ^CK2Z, and ^CK3Z show the way: pick a type (ZINT for these), and attempt to convert whatever the user passed into that type. On failure, complain about "Bad Argument Type". There's just one shortcoming for my use-case: When dealing with numbers in non-decimal bases - one of the arguments taken by Sum3Pal sets the base it operates in - supporting the HXS type would be desirable, but ^CK1Z and friends only accept ZINTs, reals, and (for some weird reason) strings that can be parsed into ZINTs. As we discovered recently in another thread, there isn't even a direct way to convert a HXS to ZINT without writing the ASM code for it yourself. My solution is a wrapper around ^CK1Z which converts a HXS to one of the other accepted types: since reals are potentially lossy, I went with strings as the intermediate format in this indirect conversion. Unlike Gerald's solution in the linked thread, mine simply decompiles the HXS with a word length of 64 in decimal mode and chops off the leading "# ". This fails for HXS longer than 64 bits, but the rest of the system doesn't really support them either, so I find that an acceptable sacrifice. In case you're wondering about the label at the start of each source file, the library-in-MASD technique depends on these to calculate the offsets from the fixed parts of the library structure to the corresponding code. Having to maintain these is not ideal, but unavoidable - luckily, when I omit or forget to change the label in a renamed file, I get a fairly informative MASD error, so it's not a big problem. They are always in the format <filename>.s, so I'll omit the filename for the code blocks representing source files. The real fun begins inside fromZINT obviously, but the arguments aren't just type-checked here - their order on the stack changes, and I squeezed some code in here which makes a copy of the base and converts it to a BINT (which we'll need later). The method to arrive at such an optimization is merely a second pass over the code - I had the conversion inside fromZINT previously, where I PICKed the number in ZINT format, converted it, and UNROLLed the result to its proper position (there was some more reorganization in there afterwards, but you get the point), and the obvious question was: is there a spot where the ZINT is conveniently located in stack level 1, and is there something to be gained from moving the conversion to that place? The answer was yes to both, and move it I did. That conversion to BINT also checks its value range. The errors aren't quite what I'd prefer (that would be "Bad Argument Value" which would be in line with other non-CAS commands, instead we get "Negative integer" or "Integer too large"), but oh well, I'll take the free range checks. The upper limit of 9999 even gives me a good amount of headroom so I can do some arithmetic up to a few multiples of the given base without running into integer overflow problems. Code: LABEL Sum3Pal.s Code: LABEL ck1z.s Other than sacrificing pointless consistency in the call interface of subroutines intended for internal use only, the only lesson to be learned from these is to try several different orders of arguments to see which is the best one - in this case, ZINTtoDigits benefits from keeping the base in level 2 and the number to be converted in level 1, because even though ^Mod needs them the other way around (it needs to work on copies anyway, so we can fix the order for free by combining the 2DUP creating those copies with a SWAP), updating the remaining ZINT in each loop iteration (lopping off the least significant digit using ^Div) is easier with it in level 1. The base has to be kept around unmodified until the end, so we can just PICK it each time, and that needs one command regardless of level (as long as it's fixed and at most 10). Since there aren't more temporary items deposited on the stack at the time it's needed, we need 2 PICK, so let's recall the awesome signature of fellow forum member HP67: It ain't OVER 'till it's 2 PICK (yes, I had to incorporate that one somewhere ) and turn it into OVER. Code: LABEL ZINTtoDigits.s Code: LABEL DigitsToZINT.s The comments describing all the pieces of the environment may be somewhat cryptic without the context they get used in. I wrote them primarily as a reference for myself to tell which one I need to use when writing the rest of the code. LAM1 is clearly the base given by the user (the only non-obvious aspect is that it's a BINT), LAM2 is half-explained above (I'll explain the subroutine's code separately below), and for the rest you'll have to search for their context. There's also some error checking in here. In retrospect, I should have moved it up into the Sum3Pal source file with the rest of the error checks, but in terms of optimization it doesn't make a difference. I'm not changing it now, you get the code as submitted. Much of the work to solve the challenge is hidden in the lines from the AlgSList call to the pickType call. Those lines use the digit count to extract the fitting subroutine from a list, and if it exists it's EVALed, else you get pickType as a fallback. This is a fairly decent rendition of a C-style switch() statement; if the non-default cases weren't consecutive, something based around Lookup instead of NTHELCOMP would be needed, but the benefits of omitting BINTnn #=casedrop (2 commands per case if referencing a builtin BINT) are reduced because you would need the BINTnn back. Those lines receive the input digits meta as argument and are supposed to give back three lists. Around them, these arguments are converted from and to ZINTs. I use lists instead of metas here because as single objects they are easier to juggle around, and there's another few places where they need to be juggled, most of them connected to subAsZINTthenAlgS. Code: LABEL fromZINT.s Calling one of these subroutines from another in these places is a no-brainer - but it has some implications for how I can return results with less than the full three palindromes. Just omitting the unneeded ones is a cardinal sin, of course (it doesn't give any way to tell at what stack level the non-palindrome items start). The actually viable options are a list or meta to bundle them up, or placeholder items for the unneeded ones. I found it to be more efficient if I use the latter option, with the appropriate representation of a zero as the placeholder. The otherwise unremarkable subroutine AlgS1 for 1-digit numbers showcases this: convert the meta on the stack (consisting of the digit and BINT1 as the digit count) into a list, and pad it with two lists containing one BINT0 each. Early on I determined that I wanted a valid digit in the placeholder, not just an empty list. That was before I switched the user-facing output format from digit lists to ZINTs, so an empty list would have been a user-visible inconsistency: AlgS1 (and some others) can produce a list containing a single BINT0 as one of their primary outputs too, alongside placeholders. The later switch to ZINT output format can hide this inconsistency, but I left this small optimization potential behind in case there's a need to hand the digit-list format back to the user again. Utilizing this potential would also have ripple effects throughout the higher-order "short" / "small" subroutines (because they usually pick apart the result of the call to lower-order ones to check for a third palindrome instead of checking all the conditions beforehand; I'll get back to that below), so it's a fairly invasive change with questionable benefits. Code: LABEL AlgSList.s Code: LABEL AlgS1.s The remaining two cases share much of their code, but \(\delta_1>\delta_0+1\) needs some massaging to get the values in line with the other case's expectations. Comparing the cases in the proof will easily reveal that the compensation needs to replace \(\delta_1\) with \(\delta_1-1\) and \(\delta_0\) with \(g+\delta_0\), so that's what the other nested secondary does. (Remember that the base we operate in, which the proof calls \(g\), is stored in LAM1.) Sharing code between similar cases with an IT or ITE to take care of the differences will be a common sight in the other subroutines, better get used to it now; we'll also see some more messy instances of code sharing, including some runstream and return stack manipulation. Code: LABEL AlgS2.s Note also that the \(g-1\) digit needed for the palindromes in four cases gets held onto throughout a series of checks. That's some minor recycling; there are numerous opportunities for that throughout this challenge. In several places it saves me the disposal of a "superfluous" item on a stack as well as putting the required item back on the stack where I need it, and all I need to do for it is some extra stackrobatics, alongside a proof that it's guaranteed to have the value I need. (That's easy here since I generate it from scratch; in other places I recycle leftover digits, and I need to ensure there isn't a corner case where they deviate from the value normally seen.) Code: LABEL AlgS3.s Its job revolves around handling those cases where one of the AlgS subroutines calls another to build up part of its result, but that description doesn't tell you much about what its arguments or results look like. As a compromise, the name describes how it achieves its objective: taking a number in meta form on the stack (which is generally a snippet of the input number) and another number as a single BINT, which is produced by a callback taken from the runstream, it converts both to ZINTs, calculates the difference, converts that back to a meta representing the digits, and finally pulls the appropriate subroutine out of AlgSList and EVALs it. (Using COLA_EVAL instead of plain EVAL is just a matter of reducing the return stack depth while the subroutine is running - a free micro-optimization. In recursive programs this could turn into an important piece of the puzzle though: COLA implements what functional programming languages call "tail recursion".) During the callback, the stack contains just the ZINT equivalent of the meta on level 1 and the ZINT form of \(g\) on level 2. (By the way, do you see how \(g\) is passed through from DigitsToZINT to ZINTtoDigits? That's not coincidence, it's iterative optimization applied to those two helpers to make it happen!) Beyond that, the stack contents are whatever the caller had on the stack above the meta. In many places calling subAsZINTthenAlgS, the callback makes use of that stack layout: for instance, a simple ROT as callback pulls down the item that was just above the meta and uses it as the number to subtract from what the meta represents. Note also that the round-trip to ZINT and back not only tolerates leading zeroes, it even strips them out. That's how the lookup in AlgSList can yield a subroutine corresponding to a number length significantly shorter than the length of the meta. Thus, there's no danger of accidentally calling e.g. AlgS3 on a 1-digit number with two leading zeroes (which its algorithm isn't prepared to handle); in this example there will be a 1-element meta after ZINTtoDigits is finished, and AlgS1 will be called instead. You may remember that the virtual stack is supposed to contain the digits of the number under examination. This isn't quite true during the subroutine call here (it contains the original number's digits, not the inner one's), but it doesn't matter because none of the AlgS subroutines touch the virtual stack. There's simply no need to correct its contents and restore it to the original afterwards for the outer number (or more efficiently, push another virtual stack and drop it afterwards). In case you're wondering why I ignore the virtual stack: inside each of the AlgS subroutines, the digit count is a fixed number (and a small one to boot), so it's more efficient to juggle the digits on the normal stack than picking them from the virtual stack could ever be. Callbacks are a powerful pattern for optimization via code sharing, regardless of programming language: Since you're getting to execute arbitrary code, you can do basically anything inside the callback. Additionally RPL makes it easy to use this pattern, since it breaks down the barriers between code and data - more than most modern languages do, even the functional ones (except maybe its ancestor LISP), since you can just treat a list as a program and vice versa. SysRPL amplifies this power even further with runstream and return stack manipulation: you can fix up the shared code from inside the callback. AlgS5Callback has an example for that. Code: LABEL subAsZINTthenAlgS.s In 4-digit numbers referring back to the smaller cases via subAsZINTthenAlgS plays a very central role. Case i is the one actually doing that, taking the most significant digit \(\delta_3\) and attempting to split off \(\delta_{3}00\delta_3\) as the biggest palindrome. The other cases handle various failures in doing that: iv and v come into play when subtracting \(\delta_{3}00\delta_3\) would go negative, so I sort them out before the subtraction. ii and iii are the ones where the difference breaks down into three palindromes instead of the usual two (or rarely, just one). To find these I just optimistically break it down and check if the third one (here's where keeping them sorted by magnitude shines) is a placeholder. If yes, leave it on the cutting room floor and get out of there. The meta fed into subAsZINTthenAlgS contains the three digits \(\delta_2\delta_1\delta_0\), and the callback pulls \(\delta_3\) down. That's about as straightforward as a sample use of subAsZINTthenAlgS can get. Distinguishing ii and iii is then done by taking a hard look at the proof's figures for the 3-digit case \(201\) and the 2-digit case \((\delta+1)\delta\) and picking a difference between them which can be efficiently tested for. The third palindrome has been used up to check if it was a placeholder (and its value was \(1\) either way), leaving \(\delta\delta\) and \((g-1)\) for ii, or \(101\) and \((g-1)(g-1)\) for iii. I settled for checking the bigger palindrome's second digit (\(0\) for ii, \(\delta\ge 1\) for iii), but reading the code again just now I see a better alternative: instead of blindly extracting only the first digit of the smaller palindrome (\(g-1\) for both, which I can recycle - by extracting it before the decision the extraction becomes shared code), I could convert that smaller palindrome into its meta form, use its length for the decision, and DROP the extra \(g-1\) at the start of case ii. That can be done by replacing CARCOMP OVER TWONTHCOMPDROP_ #0=case with INNER#1= NOTcasedrop (saving two commands, i.e. 5 bytes). You see, at this level of optimization even someone with a good amount of practice can miss a trick. It's not a trivial business, after all. ii has three unnamed sub-cases, two of which are eerily similar (those for \(\delta_3=1\) and \(\delta_3=g-1\)). Using shared code for these is an obvious move - and mathematically I could apply the same template to the third sub-case as well, but I've set a limitation for myself: no deviation from the proof's results, even if it makes the code better. iii is divided into named sub-cases iii.a and iii.b, and the former in turn has two unnamed sub-cases. The first of those is basically identical to iii.b (apart from the exact digits the input number had, which are irrelevant at this point), so I want to get these handled by the same code. As it turns out, the test that separates the sub-cases of iii.a also works correctly for iii.b because with \(1\le\delta\le g-2\) and \(\delta_3=1\) (the condition for the other sub-case of iii.a), the condition for iii.b could never be satisfied: \(\delta_3+\delta\le 1+(g-2)=g-1\) is clearly less than \(g\), therefore it can't be \(g+\delta_0\) for a valid (i.e. non-negative) digit \(\delta_0\). That is to say, testing for the condition distinguishing the sub-cases of iii.a is sufficient. The absence of code comments mentioning iii.a and iii.b is therefore explained by them using the exact same code, which I simply gave the label representing both together, iii. One of these unnamed sub-cases recycles an entire palindrome at once, \(\delta\delta\). Assuming you can call that "recycling", that is. You could also call it "fixing up the other palindromes around it". Code: LABEL AlgS4.s That creates other problems though: subAsZINTthenAlgS wants to convert the callback's result from BINT to ZINT, and the more significant digit is also supplied as a BINT on the stack. I circumvented both at the same time with a trick that's either very elegant or a massive hack, depending on viewpoint: knowing the exact code of subAsZINTthenAlgS, I see that the command to convert from BINT to ZINT will be the first element in the runstream belonging to the top entry of the return stack while the callback runs. I can consume that and simultaneously put it to good use on that single digit with 'REVAL. The 4PICK before it fetches the digit to be converted, and the rest of the callback appends the other digit, which is always \(1\). Such return stack manipulation is quite mind-bending, and it can easily become a maintenance nightmare. I spent a significant fraction of this callback's source on a comment documenting what is going on, just to avoid that outcome. And yes, the comment contains a typo (capitalization of subAsZINTthenAlgS). Deal with it. The callback has its own source file instead of residing inline in the parent subroutine because it's used in two separate places in AlgS5. (Is code reuse getting out of hand when not just the callback-using functions, but even the callbacks passed to them are shared? I think not, it's just comprehensive optimization.) Code: LABEL AlgS5Callback.s That is the only notable hole left in version 2. Apart from a cluster of typos in cases v.d and v.e of 6-digit numbers starting with \(1\) (which avoids invalid output by pure luck; I'll get back to that below) all the other problems are quite minor (like a handful of wrong input digits listed in some figures, missing explicit definitions for some variables used, some phrasing in Algorithm V that fails to reflect changes done between versions 1 and 2, ...) and can generally be resolved by any reader who notices them with relatively little effort. The case structure of the proof's handling of 5-digit numbers is built around two attempts to split off a 5-digit palindrome and each time hoping that the rest can be represented by two palindromes. Since all 5-digit numbers starting with anything other than \(1\) are handed over to Algorithm I, the outermost digits of the 5-digit palindrome have to be \(1\) as well. That leaves the second and fourth digit (which must have the same value to make a valid palindrome), and the central third digit, i.e. two independent digits. The various cases result in a variety of combinations for them with little in common between them; I managed to share the code snippet building the 5-digit palindrome anyway by constructing it last, outside the cases, from the two parameter digits. Therefore the cases are dispatched inside a long embedded secondary. Exiting it means jumping to the snippet which builds the 5-digit palindrome; the two "failure" cases (iv and v) which don't use a 5-digit palindrome remove this snippet from the return stack with RDROP. (These also have more in common than just that, but the code to construct their common pieces is so short and their positions in the branching structure of the code so awkward that I couldn't find an actually worthwhile way to make them share code. You can take this as a reminder that you need to evaluate whether an "optimization" is an improvement or actually detrimental.) The first attempt is with the palindrome \(1\delta_{3}0\delta_{3}1\), where i poses as the "success" branch, ii, iii, and iv take care of the 3-palindrome remainders, and v, vi, and vii treat the would-be negative remainders. Most of the latter move on to the second attempt: \(1(\delta_3-1)(g-1)(delta_3-1)1\). v is the one number where constructing that second attempt's palindrome isn't possible due to an invalid digit (\(delta_3-1\) would be negative). In this second attempt, vi is the "success" branch, and vii takes care of the few possible failures: the subtraction can't produce a negative number this time, and the 3-palindrome remainders are limited to the 2-digit template \((\delta+1)\delta\). Just like in AlgS4, both attempts at using subAsZINTthenAlgS are optimistic, as in: "failure" branches covering a 3-palindrome remainder are sorted out afterwards. For the first attempt (the second attempt doesn't need to handle this), telling 3-palindrome remainder \(201\) (case ii) apart from the others can again be optimized further: replace UNROTOVER TWONTHCOMPDROP_ #0=case with UNROT INNER#1= NOTcasedrop and delete the nearest CARCOMP after that (i.e. the one between ii and iv) to save one command. Case iii has sub-cases iii.a and iii.b, but they merely differ in their input digits, which are irrelevant as far as my code is concerned. Code: LABEL AlgS5.s |
|||
« Next Oldest | Next Newest »
|
User(s) browsing this thread: 14 Guest(s)