It not a joke!!! It is the truth!!!

Giving people what they want: violence and sloppy eating

Insufferable but puzzled
lovingboth
C is not my computer language of choice. My brain can cope with one set of conventions for what silly character means what, and Forth got there first (as well as being a much better language in lots of ways...)

So it's been some time - the 90s? - since I did anything involving pointers in C. (That cix program I mentioned Terry Pratchett using was it.)

The insufferable bit comes from having done something with them last night... and it working! First time! No crashes or anything!*

The puzzled bit comes from the way that changing a bit of the program that's done less than 1% of the time sped it up almost as much as making the same change to the bit that's done over 99% of the time.

Original algorithm: uses 8 'if' statements a lot, most of which fail to be true. It also does 8 array accesses each time. 355ms average over 1000 goes.

Second version: in the bit that's used 124/125ths of the time, replace the 'if' statements by additions and bitwise 'and's and make a simple adjustment to the final result to get the right answer. 315ms average over the same 1000 goes.
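The original code isn't shown, so what follows is only a sketch of the kind of change described, not the actual program. It assumes a hypothetical row-major grid of bytes in which bit 0 marks a cell as set, and counts the eight neighbours of an interior cell `i`. The names `count_with_ifs` and `count_with_adds` are invented for illustration, and the "simple adjustment to the final result" is left out because the post doesn't say what it was.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed layout (not from the post): `width` bytes per row,
   bit 0 of each byte set when the cell is "on".
   `i` must be an interior cell so all eight neighbours exist. */

/* Branching form: eight tests, most of which usually fail. */
int count_with_ifs(const uint8_t *grid, size_t width, size_t i)
{
    int n = 0;
    if (grid[i - width - 1] & 1) n++;
    if (grid[i - width]     & 1) n++;
    if (grid[i - width + 1] & 1) n++;
    if (grid[i - 1]         & 1) n++;
    if (grid[i + 1]         & 1) n++;
    if (grid[i + width - 1] & 1) n++;
    if (grid[i + width]     & 1) n++;
    if (grid[i + width + 1] & 1) n++;
    return n;
}

/* Branch-free form: mask out the bit and add it, no tests at all. */
int count_with_adds(const uint8_t *grid, size_t width, size_t i)
{
    return (grid[i - width - 1] & 1) + (grid[i - width] & 1)
         + (grid[i - width + 1] & 1) + (grid[i - 1] & 1)
         + (grid[i + 1] & 1)         + (grid[i + width - 1] & 1)
         + (grid[i + width] & 1)     + (grid[i + width + 1] & 1);
}
```

Both functions return the same count; the second simply trades eight conditional branches for eight 'and's and seven additions.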

Pointer version: because I know how far apart the array accesses are, do it via a pointer instead. 307ms average.
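As a rough illustration of the pointer idea, here is a sketch under the same sort of assumptions as before: a hypothetical row-major grid of bytes, bit 0 meaning "set", and an interior cell `i`. The function name `count_with_pointer` is made up. Because the neighbours sit at fixed distances from each other, one pointer can be stepped a row at a time instead of recomputing eight array indices.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed layout (not from the post): `width` bytes per row,
   bit 0 of each byte set when the cell is "on". */
int count_with_pointer(const uint8_t *grid, size_t width, size_t i)
{
    /* Start at the top-left neighbour of cell i. */
    const uint8_t *p = grid + i - width - 1;
    int n = (p[0] & 1) + (p[1] & 1) + (p[2] & 1);   /* row above */

    p += width;                                      /* same row */
    n += (p[0] & 1) + (p[2] & 1);                    /* skip the cell itself */

    p += width;                                      /* row below */
    n += (p[0] & 1) + (p[1] & 1) + (p[2] & 1);
    return n;
}
```

Whether this beats plain indexing depends on the compiler, which may well generate similar code for both.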

So between them, that saved 48ms per go, 40ms of which was losing the ifs. (And about 110ms of what's left is displaying the result.)

It obviously won't make as much difference, but for the fourth version, replace the 8 ifs in the bit that's done 1/125th of the time. Because the array accesses are not as simple in those, leave them as array accesses rather than switching to a pointer...

... 269ms per go = 38ms saving!?!

How did losing the ifs 124/125ths of the time save 40ms and losing them 1/125th of the time save 38ms? Was gcc generating better code for the first version of the 124/125ths bit?

If nothing else, I now know that having a series of failed 'if's on this ARM CPU must be really slow.

* Mind you, I did forget that - to save the lazy typists responsible for C and *ix a few keystrokes - when you add one to a pointer, the compiler thinks that you mean 'one lot of the size of the things you've told me I'm pointing at' rather than 'one'. But fortunately, I'm pointing at bytes, rather than anything bigger. Had I been pointing at words or anything else not one byte long, it wouldn't have worked.
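A small sketch of that scaling rule, with made-up function names: each function reports how many bytes a pointer moves when 1 is added to it. For a pointer to bytes the step is 1; for a pointer to 32-bit words it is `sizeof(uint32_t)`.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes covered by `p + 1` when p points at single bytes. */
size_t step_of_byte_pointer(void)
{
    uint8_t buf[8] = {0};
    uint8_t *p = buf;
    return (size_t)((char *)(p + 1) - (char *)p);
}

/* Bytes covered by `p + 1` when p points at 32-bit words. */
size_t step_of_word_pointer(void)
{
    uint32_t buf[2] = {0};
    uint32_t *p = buf;
    return (size_t)((char *)(p + 1) - (char *)p);
}
```

So code that adds a raw byte offset to a pointer only lands where intended when the pointed-at type is one byte long, or when the pointer is cast to a byte pointer first.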

This entry was originally posted at http://lovingboth.dreamwidth.org/544395.html, because despite having a permanent account, I have had enough of LJ's current owners trying to be evil. Please comment there using OpenID: if you have an LJ account, you can use it for your OpenID account. Or just join Dreamwidth! It only took a couple of minutes to copy all my entries here to there.

Depends on the ARM version - older/slower ones, as used in embedded applications, would flush the pipeline on a branch, taking three ticks, where most instructions normally executed in just one - hence the utility of almost all instructions being optionally conditional (up to the new 64-bit generation), so instead of something like:

CMP R1,R2
BNE hop
MOV R1,#44
.hop [rest of code]

- you'd use:

CMP R1,R2
MOVEQ R1,#44
[rest of code]

To find out the actual answer here, you'd probably need to delve into the disassembly, and see what's actually being executed.
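For reference, the assembly fragments above correspond to C along these lines. This is only an illustration with invented function names: the first is the plain 'if', the second gets the same result with a mask and no 'if' at all. Which instructions a compiler actually emits for either one depends on the compiler, its options and the target CPU, so the disassembly is still the thing to check.

```c
/* Plain form: set r1 to 44 when r1 equals r2. */
int set_if_equal(int r1, int r2)
{
    if (r1 == r2)
        r1 = 44;
    return r1;
}

/* Mask form: no 'if' in the source.
   Assumes two's complement ints, so -(1) is all ones. */
int set_if_equal_masked(int r1, int r2)
{
    int mask = -(r1 == r2);              /* all ones if equal, else zero */
    return (44 & mask) | (r1 & ~mask);   /* pick 44 or the old r1 */
}
```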
