I have a small 8-bit processor which has a N-to-M decoder on some output lines - eg, for the 5 to 32 bit case, I write 00101 and bit 5 changes state. The only interface to the output is change-state, there is no read-back.
The device counts rapidly (but randomly) occuring events, and should provide this count as a 'single bit changes' code to another device. The output pins are read in parallel by another device, and may be read as rapidly or as sparingly as the other device decides, so the count is necessary.
I do NOT need to use the standard Binary Reflective Gray code - I can use any single-bit changing code.
However, I want to be able to track the next bit to change efficiently.
I do not have a "LowestBitSet" instruction, and finding lowest bit set across four 8 bit registers is cycle consuming - so I cannot use this "common" approach:
Keep binary counter A
Find B as A XOR (A+1)
Bit to change is LowestBitSet in B
I wish to calculate this in as little memory and registers as possible, and memory is definitely too restricted for any large lookup table. Cycle time is the more important factor.
Any suggestions on algorithms?
For Binary Reflective Gray Code, see this answer for an efficient way to calculate the code N.
XOR with the previous code to get a value where only the bit to change is set.
Then you can use this Bit Twiddling Hack (the case where "v is a power of 2") to find the bit index with only 3 operations and a 32-entry table.
The pseudo-code is something like this:
LowestBitSet(A ^ (A+1))
is always 0, unless you work for IBM. I think you meanHighestBitSet()
, which is roughly the same aslog_2()
.The bit-twiddling hack immediately preceeding the one linked by AShelly will be much more feasible on an 8-bit micro.
This should make your original algorithm fairly practical, generating
{ 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, ... }
.As for the possibility of changing to a different sequence which would also generate a Gray code, for the purpose of making it easier to compute, that's very interesting but I haven't come up with anything.
"Algorithm L" on page 10 of Knuth, Donald E. "Generating all n-tuples." The Art of Computer Programming, Volume 4A: Enumeration and Backtracking, pre-fascicle 2a, October 15, 2004 seems ideal. Step L4 would be "change_state(j)" for your device.
You don't need to calculate the Gray codes and xor them, you can just use the counter itself, and then use a 256-element lookup table to count the number of trailing zeros. Like this:
If you unroll the loop, this is at most 2 adds, 1 xors, and 5 loads (but almost always less). If you don't have 256 bytes for the table, you could use the same strategy on nibbles.
/*
As previously posted this answer, https://stackoverflow.com/questions/4648716#42865348 using a Johnson counter Gray code, is very simple:
which is executed on every event occurrence.
Otherwise, using a Binary Reflected Gray Code and a 4 byte base 2 counter
n
, ...Method 1 - using a table */
/Method 2 - no tables */
/*
Bit Bunnies and Cargo Cult Coders have no need to read further.
The efficiency of execution of the above code depends on the fact that
n[0], n[1], ...
are computed at compile time as fixed addresses which is quite conventional. Also, using a call-by-need optimizing compiler will expedite the array contents so only one indexed value needs to be calculated. This compiler sophistication is likely missing but it is easy to manually assemble the raw machine code to do it (basically a switch, computed goto, etc. ).Analysis of the above algorithms, using the orphaned code, shows that every function call will execute exactly the same instruction sequence, optimized or not.
In both methods, the non-orphaned returns require handling the case when there is counter roll over on 0, consequently using extra index and logical (
!
) operations. This extra happens for 1/2 of 1/2 of 1/2 or 1/8 of the total counts, and for one count in this 1/8 there is nothing to do but return 31.The first method requires 2 logical operations (
! ||
), 1 addition and 3 index calculations for 7/8 of the total counts. On a single count of zero, 3 logical and 3 index operations are needed and the rest of the other 1/8 need 3 logical, 1 addition and 4 index operations.The final code on execution of method 2 (compiled optimally), for 7/8 of the calculations, uses 7 logical operations (
|| & ! < -
the last is two's complement), 1 arithmetic (+
) and 5 calculated index operations. The other 1/8, but one instance, need 8 logical, 1 addition and 6 calculated index operations.Unfortunately, no flash of divine inspiration manifested any cargo code. This is an abridged tale of how this composition's authorship happened.
How this was done involved a crucial preliminary investigation as documented: https://stackoverflow.com/a/42846062. Then code was derived using a succesive refinement process commencing with an assesment of the algorithms in this post.
Specifically, this answer: https://stackoverflow.com/a/4657711 .
This algorithm's time variant execution from the loop overhead will be poignently and prominently accentuated by reducing the return calculation to a single addition and two index operations.
*/
/-------------------------------------------------------------- --------------------------------/
/*
There is a comparable chronology for method 2.
This sequence has been radically attenuated and abbreviated to an intermediate example:
Inadvertently, the code posted in https://stackoverflow.com/a/42865348 was not the exclusively byte sized one as intended. The intended code is in this post. */
/*
Rhetorically, the answer to How do I find ... can only be answered explicitly in the personal sense (see this answer: https://stackoverflow.com/a/42846062) as it is not possible to speak for other individuals' cognitive abilities.
The content of https://stackoverflow.com/a/42846062 is crucial for background information and reflects the very personal pedantic mechanism required for the mental gymnastics to solve this problem. Undoubtedly, the milieu and plethora of material is daunting but this is exactly the personal approach taken in garnering sufficient insight, repertoire, anecdotal precedents, etc. to extrapolate and interpolate an answer to answer specifically, what program meets the criteria, exactly. Further, it is this very "chase" that excites and motivates the (perhaps pathological) psyche to invest time and effort to satiate curiosity with an inquisitive quest. */
The algorithm posited by the OP does not generate any Gray code.
The algorithm in this answer: https://stackoverflow.com/a/4657711/7728918 is not constant time, since the conditional test
if (x)
can vary from 1 to 4 executions depending on the value ofcounter[i]
. This changes the amount of time that is required to calculate bit numbers. There will be 4 different possible execution times that any single calculation may have.See (cargo cult coders excepted) the reference for the rationale of the following that meets this constant time requirement (no lights, no car, not a single luxury ... not even a "table"):
However, there is a much better, more succint and expedient method to meet the OP's criteria.
For cargo cult coding purposes the following conveniently satisfies the OP's conditions minimally (maximally? ;).
The bit number to change is found each time with only two operations: a modulus (which if done modulo
2^n
could be as simple as a bit-wise&
operation withn-1
bits ie. the constant2^n - 1
) and an increment.The actual Johnson Gray code (JGC) sequence is generated incrementally by
XOR
ing the previous code with the desired bit, selected by a left shift of 1 to the bit number position. This calculation is NOT needed as per the OP's requirements.The Johnson Code
-------------------------
The actual Gray coding is irrelevant, so using a Johnson counter's Gray code is exceptionally trivial.
Note that the Johnson Gray code (JGC) density is linear and not logarithmic like base 2 or binary reflected Gray code (BRGC).
With 32 bits in 4 bytes, the sequence can count from 0 to 63 before resetting.
/./ / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / /.
/./ / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / /.
Test output:
partially generated with this code:
/*
Unfortunately, this answer does not explain "How do you (I, did I, ...) find ...".
For details on the methodology of finding such solutions and using a BRGC similarly ...
see previous ref.: https://stackoverflow.com/a/42846062/7728918