(50g) Savage SysRPL Revisited

11-11-2018, 07:29 PM
(This post was last modified: 01-08-2019 03:39 AM by DavidM.)




(See this page for more information regarding the Savage benchmark on HP calculators)
(Edit: the SysRPL and Saturn assembly code included below has been edited for compatibility with the built-in MASD assembler [ASM] available on the 49G-50g calculators)

A straightforward UserRPL implementation of the Savage benchmark on a 50g usually looks something like this:

Code: \<<

Translating the above into SysRPL turns out to be very easy, since most of those commands have nearly identical counterparts in SysRPL:

Code: !NO CODE

Recall that the point of this particular benchmark is to measure the computation speed of the functions used, i.e. x*x, SQRT, LN, EXP, ATAN, TAN, x+1. The vast majority of processing time in both of the above versions is spent in the numerical computation of those functions, so the SysRPL version only gains about a 19% performance advantage in this case. That time savings comes from two general optimizations in the SysRPL version: no type checking of data, and a slightly faster looping construct. While those two features can sometimes yield a decent savings in SysRPL run times, they are minimal compared to the time spent "number crunching" in these two programs.

Another potential advantage of SysRPL implementations is the ability to perform calculations using the full 15-digit internal representation of real values. Chain calculations such as those used in this benchmark are more likely to benefit from this kind of treatment, so it makes sense to reimplement the SysRPL version with this in mind.

As with the first SysRPL implementation, translating the UserRPL code to an extended real SysRPL version is fairly simple, except for one function: there is no defined ATAN function for extended reals. Each of the other commands has a direct counterpart, but a workaround has to be used for ATAN.
In this case, the argument to that function is always positive, so ATAN can be computed with the identity arctan(x)=arccos(1/sqrt(1+x^2)) (source: this comp.sys.hp48 post):

Code: !NO CODE

As expected, this version reduces the accumulated "error" of the final result, but unfortunately it takes even longer than the UserRPL version to finish because of the extra computations required for the ATAN workaround. This is frustrating, especially since even the standard precision real functions carry out their computations to full 15-digit precision internally before rounding. The lack of an ATAN function for extended reals thus limits the ability to measure the true performance of the calculator.

Edit: Since my original posting of the following programs, I've learned that I'm merely the latest participant in a party that started almost 20 years ago! Jonathan Busby had already gone through the same thought processes and come up with Saturn solutions similar to what I've presented below. His code targets the 48-series as opposed to the 49-50, but if you look you'll see that we essentially used the same approach (and nearly the exact same code). All credit is due to Jonathan for these ideas (though I promise I had not seen them prior to my post!).

To remedy the missing extended real ATAN, I propose the following alternative version of an extended real SysRPL implementation:

Code: !NO CODE

This version uses a Saturn code object that provides the equivalent of an %%ATANRAD function, had one existed. The entry points it uses are NOT supported, but they are at least consistent between a v1.19-6 49G and a v2.15 50g. They stand a good chance of being in the same fixed locations on intermediate firmware versions, but I haven't verified that, since I don't have any calculators with those firmware versions to test on. The final result is of course the same as the previous extended real version.
But note the execution time: nearly identical to the standard real SysRPL version. This is because the actual computations occurring in both programs are all carried out to 15 digits internally (though obviously with different intermediate values). The very slight performance improvement of the extended real version is due to the intermediate results not having to be rounded to 12 digits at each step. This rounding in the standard real SysRPL version takes a small but measurable amount of time (compared to the overall computation time).

To satisfy my curiosity, I also implemented one last version of the benchmark, this time entirely in Saturn assembly:

Code: !NO CODE

This version is perhaps the best one to show how much time is spent performing actual numerical calculation for the benchmark (at least at the Saturn emulation level). Stack manipulation only happens once, and is limited to the very last step. Loop overhead is minimal and happens at Saturn speed, and all intermediate calculations are simply performed on the value currently stored in the CPU's A/B registers. This means that very little processing is performed outside the realm of numerical computation in this version, giving the purest view of how much time is spent on the calculations themselves (as opposed to stack manipulation and loop overhead).

I believe these versions of the code give additional insight into the Savage benchmark running on a standard 50g, allowing better comparisons to be made with other platforms and configurations.

(Note: all run times listed are an average of 5 runs of the specified code on my 50g)
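
For anyone who wants to experiment with the structure of the loop away from the calculator, here is a rough Python sketch of the benchmark chain (x*x, SQRT, LN, EXP, ATAN, TAN, x+1) together with the arccos-based ATAN workaround. It is not the code from this post. The names round_sig, atan_via_acos and savage are made up for illustration, and the start value of 1 and the 2499 iterations are the customary benchmark parameters, assumed here rather than taken from the listings above. Python floats are binary doubles, so the optional digits=12 rounding only loosely imitates the calculator's decimal rounding at each step and will not reproduce the 50g's results or timings.

```python
import math


def round_sig(x, digits):
    """Round x to `digits` significant decimal digits (hypothetical helper)."""
    return float(f"{x:.{digits - 1}e}")


def atan_via_acos(x):
    """arctan(x) for x > 0, using arctan(x) = arccos(1 / sqrt(1 + x^2))."""
    return math.acos(1.0 / math.sqrt(1.0 + x * x))


def savage(iterations=2499, atan_fn=math.atan, digits=None):
    """Run the Savage chain; if `digits` is set, round after every step."""
    keep = (lambda v: round_sig(v, digits)) if digits else (lambda v: v)
    a = 1.0  # assumed customary starting value
    for _ in range(iterations):
        a = keep(a * a)
        a = keep(math.sqrt(a))
        a = keep(math.log(a))
        a = keep(math.exp(a))
        a = keep(atan_fn(a))
        a = keep(math.tan(a))
        a = keep(a + 1.0)
    return a


if __name__ == "__main__":
    # The exact answer is iterations + 1; the difference is accumulated error.
    print("native atan    :", savage())
    print("acos workaround:", savage(atan_fn=atan_via_acos))
    print("12-digit steps :", savage(digits=12))
```

Comparing each printed value against 2500 gives a feel for how rounding every intermediate result affects the accumulated error, which is the same effect discussed above for standard versus extended reals.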
