I understood why Duff's device is faster than normal loop code which can be unrolled but is not optimized. But I can't understand how the code can be compiled yet.
I guess it's a trick about the switch syntax. But not anymore.
How can do while sentence exist in switch sentence? Very weird.
Is there anyone who can explain this?
Edit: Another question. Why did duff use 8? It could be 16, 65536 or whatever. Because of code size? Is there another reason? For example, cache or pipelining benefits.
The "how it works" is simple enough.
Both C and C++ are compiled languages, and normally compiled to the platforms machine code. Machine code has no concept of block structures - all block structures must be translated to a form that uses (in effect) some mix of unconditional and conditional gotos.
The C syntax rules allow the switch statement and loop to be combined in a way which is not a true hierarchical block structure, but which tangles control flow. So long as the compiler can cope with this (which any good compiler should) there is no problem in the underlying machine code. The result will be "spaghetti", but generated machine code that has been through an optimiser is always spaghetti - it's not meant to be human readable, so it's not an issue. The issue here is that the source code is spaghetti too, even though the gotos have been "hidden".
Note - although any good compiler should cope with Duffs device, as others have already commented, that doesn't mean it will cope well enough to optimise it properly - only well enough to generate correct executable code. This is one of these old strange idioms that once had a purpose, but which is more likely now to confuse your compiler and sabotage it's ability to generate efficient code.
EDIT
The following is related to Duffs device, and may help illustrate the basic idea...
Having case clauses inside the loop is conceptually no different to having goto-target labels inside the loop, as above.
Warning - this is purely for illustration of an idea, and should not be copied in real life code.
switch
is just a computedgoto
. So, there are several labels inside the loop, and aswitch
statement outside the loop. Theswitch
decides which label to go to, andgoto
s there, inside the loop.Once execution is inside the loop, it goes on looping until the loop relinquishes control.
It's actually very straightforward… and shouldn't be used unless it is the most straightforward alternative.
Don't believe anyone who tells you it makes anything fast.
I would even say to stop listening to anything they say at all, even if it's your teacher.
As for compilers, they break things down to generic control-flow graphs and don't care about
switch
vs.if
vs.while
. They are allif ( … ) goto …; else goto …;
to the compiler.A simple explanation of why Duff's Device compiles is that the syntax of the
switch
statement isn't particularly specific about the form that theswitch
statement block might need to take. There are a few restrictions, and a couple of things permitted in the controlled statement that aren't permitted outside aswitch
(thecase
anddefault
labels). But other than that, the controlled statement is just any other statement, with the likelihood that there are labels for theswitch
to target.Here's the syntax from C99:
Beyond the syntax, the standard imposes a few constraints:
case
label expressions must be integer constant expressionscase
label expressions ordefault
labelsOther than that, any construct permitted in a statement block should be permitted in the controlled statement (with the addition that
case
anddefault
labels are OK). Remember thatcase
anddefault
are just labels that the switch jumps to based on the controlling expression and thecase
label expressions. As Potatoswatter says,switch
is just a computedgoto
. So just likegoto
can jump into the middle of a loop, so canswitch
.Also, I think that the cases where you might see a benefit from Duff's Device are pretty rare today (I think they were rare even in the 1980's). Don't forget that Tom Duff himself said the following in his description:
Even more than when it was initially described, Duff's Device should be considered more of a curiosity than a tool to use.
While it's true that Duff's Device is outdated for its original purpose, it's still useful for special purposes, like a state machine that normally cycles repeatedly through
N
states, but sometimes needs to return to the caller and later be resumed at the state where it left off. Putting theswitch
statement outside the loop and thecase
labels inside the loop (I would take this as the definition of a "Duff's device") then makes a lot of sense.With that said, do not use Duff's devices to "optimize by hand". Putting what are effectively "goto labels" all over the place will not help the compiler optimize.
If we take the implementation from the Wikipedia article you link...
...and replace the "high-level"
do
/while
loop with the assembly-levelif
/goto
that the compiler really reduces it to......it may help you perceive that - in this usage where the do/while scope didn't introduce any local variables - there's really nothing more to the do-while than a jump back to case 0 that bypasses the switch and hence the need to evaluate count % 8 (% is a pretty expensive operation in the scheme of things).
Hope that helps it click, but may not...? :-)
Just a case of diminishing returns. Having to do the
--n > 0
check and a jump after every 8 data copies isn't a big percentage overhead, but the size of the code (both in source and as compiled code in cache) is still pretty tight. Maybe it'd be 90 or 95% work vs overheads, which was evidently good enough. Further, to illustrate and share the concept with others Tom Duff may have prefered it to be about a typical 80x25 terminal's worth of code rather than a page or 10.