I've got a large body of legacy code in an old, home-grown scripting language that we compile/translate into JavaScript.
That language has a conditional jump to a label. Unlike a common goto statement, it cannot jump backward. The language has neither nested if statements nor loops.
As goto does not exist in JavaScript, I'm looking for an algorithm that transforms goto mylabel and mylabel: into a semantically equivalent structure.
I thought of using ifs but found it not trivial because of the arbitrary nesting of the goto labels.
Example:
if cond1 goto a
do something1
if cond2 goto b
do something2
a:
do something3
if cond3 goto c
do something4
c:
do something5
b:
This could be rewritten as:
lbl_b = false;
lbl_c = false;
lbl_a = cond1;
if (!cond1) {
    do something1;
    lbl_b = cond2;
    if (!lbl_b) {
        do something2;
    }
}
if (!lbl_b) {
    do something3;
    lbl_c = cond3;
    if (!lbl_c) {
        do something4;
    }
    do something5;
}
However, I was not able to derive a general algorithm from that.
This is usually called goto removal; we once had a student project whose task was to implement it for C. In general you have to work with loops (sadly, we did not put that work online). But since you have the restriction that you can only jump forward, it is relatively easy:
Parse all lines once and collect all labels. Create a flag "skip_to_label" for every label and initialize all flags to false at the beginning. When you meet the conditional goto for label X, do not jump; instead set the flag for X to true if the condition holds, and prefix every single line from there up to the label line with "if not skip_to_X".
This should already be enough to work, but it is of course far from optimal.
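A minimal sketch of that naive scheme as a small JavaScript generator. The input format (one statement per line, `if <cond> goto <label>`, `<label>:`) and the names `parseLine` and `translateNaive` are my assumptions for illustration, not part of your language or of any library; the statements are written as `trace.push(n)` so the result can actually be run:

```javascript
// Assumed input syntax: "if <cond> goto <label>", "<label>:", or a plain statement.
function parseLine(line) {
  const text = line.trim();
  const jump = /^if\s+(.+?)\s+goto\s+(\w+)$/.exec(text);
  if (jump) return { kind: "goto", cond: jump[1], label: jump[2] };
  const label = /^(\w+):$/.exec(text);
  if (label) return { kind: "label", label: label[1] };
  return { kind: "stmt", text };
}

function translateNaive(lines) {
  const parsed = lines.map(parseLine);
  // One flag per label, all initialised to false.
  const out = parsed
    .filter((p) => p.kind === "label")
    .map((p) => `var skip_to_${p.label} = false;`);
  const active = []; // labels that lie ahead of a goto we have already passed
  for (const p of parsed) {
    if (p.kind === "label") {
      // Label reached: lines after it are no longer skipped by this flag.
      const i = active.indexOf(p.label);
      if (i !== -1) active.splice(i, 1);
      continue;
    }
    const guard = active.map((l) => `!skip_to_${l}`).join(" && ");
    // A conditional goto becomes an assignment of its condition to the flag.
    const body = p.kind === "goto" ? `skip_to_${p.label} = ${p.cond};` : `${p.text};`;
    out.push(guard ? `if (${guard}) ${body}` : body);
    if (p.kind === "goto" && !active.includes(p.label)) active.push(p.label);
  }
  return out.join("\n");
}

// The example from the question, with trace.push(n) standing in for "do somethingN".
const example = [
  "if cond1 goto a",
  "trace.push(1)",
  "if cond2 goto b",
  "trace.push(2)",
  "a:",
  "trace.push(3)",
  "if cond3 goto c",
  "trace.push(4)",
  "c:",
  "trace.push(5)",
  "b:",
];
console.log(translateNaive(example));
```

Every emitted line carries its own guard, which is exactly why the result is correct but verbose. The sketch assumes every goto has a matching label further down.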
How you can optimize it: instead of prepending the if, just maintain a set of flags for every line, and instead of setting something to false, just add the corresponding flag to the set for the affected lines.
Now you can emit one if for each group of consecutive lines over which the set does not change; its condition is built from the boolean flags in the set.
Example with your given code:
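A sketch of how those per-line sets could be computed, assuming the same line syntax as in the question (`flagSets` is a hypothetical helper name, not an existing API):

```javascript
// Returns, for every source line, the flags that guard it.
function flagSets(lines) {
  const active = new Set(); // a Set keeps insertion order, so output is stable
  return lines.map((raw) => {
    const text = raw.trim();
    const label = /^(\w+):$/.exec(text);
    // Reaching a label ends the range covered by its flag.
    if (label) active.delete(label[1]);
    const flags = [...active].map((l) => `skip_to_${l}`);
    const jump = /^if\s+.+?\s+goto\s+(\w+)$/.exec(text);
    // The goto line itself is not guarded by its own flag, only the lines after it.
    if (jump) active.add(jump[1]);
    return { text, flags };
  });
}

const source = [
  "if cond1 goto a",
  "do something1",
  "if cond2 goto b",
  "do something2",
  "a:",
  "do something3",
  "if cond3 goto c",
  "do something4",
  "c:",
  "do something5",
  "b:",
];
for (const { text, flags } of flagSets(source)) {
  console.log(`{${flags.join(", ")}} ${text}`);
}
```

For the question's code this gives an empty set for the first line, {skip_to_a} for the next two, {skip_to_a, skip_to_b} for "do something2", {skip_to_b} from "a:" onward (with skip_to_c added for "do something4" only), and an empty set again at "b:".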
Now you either write the if(s) in front of each line, or you start at the top and keep one if block open for as long as the set remains the same.
So when you start, your first set is empty; the line is a conditional goto, so instead of jumping you set your flag:
Now the set changes, and you introduce a block guarded by the if for that set:
Next change in the set, so a new if block:
And so on (I guess you get the idea by now).
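Put together, the grouping could be sketched like this. Again, the line syntax and the name `translateGrouped` are assumptions for illustration, and `trace.push(n)` stands in for the "do something" lines so the output is runnable:

```javascript
function translateGrouped(lines) {
  const decls = [];
  const body = [];
  const active = new Set(); // flags whose label still lies ahead
  let openGuard = ""; // condition of the if block currently open, "" if none
  const switchGuard = (guard) => {
    if (guard === openGuard) return; // same set: stay in the current block
    if (openGuard) body.push("}");
    if (guard) body.push(`if (${guard}) {`);
    openGuard = guard;
  };
  for (const raw of lines) {
    const text = raw.trim();
    const label = /^(\w+):$/.exec(text);
    if (label) {
      decls.push(`var skip_to_${label[1]} = false;`);
      active.delete(label[1]);
      continue;
    }
    switchGuard([...active].map((l) => `!skip_to_${l}`).join(" && "));
    const indent = openGuard ? "  " : "";
    const jump = /^if\s+(.+?)\s+goto\s+(\w+)$/.exec(text);
    if (jump) {
      // The goto becomes a flag assignment; lines after it join a new set.
      body.push(`${indent}skip_to_${jump[2]} = ${jump[1]};`);
      active.add(jump[2]);
    } else {
      body.push(`${indent}${text};`);
    }
  }
  switchGuard(""); // close the last open block, if any
  return [...decls, ...body].join("\n");
}

const script = [
  "if cond1 goto a",
  "trace.push(1)",
  "if cond2 goto b",
  "trace.push(2)",
  "a:",
  "trace.push(3)",
  "if cond3 goto c",
  "trace.push(4)",
  "c:",
  "trace.push(5)",
  "b:",
];
console.log(translateGrouped(script));
```

The output is a flat sequence of if blocks, one per run of lines that share the same flag set, rather than nested ifs.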
EDIT: As one can nicely see from the sets in the example, it is in general not possible to model this with nested ifs, because e.g. the lines with skip_to_a and the ones with skip_to_b overlap, but neither range completely contains the other.
Compiling to another language is usually harder than necessary. A simpler method would be not to compile to the other language but to interpret the code in JavaScript. That way it would be easy to give your goto statement any semantics you like.
However, if you do it like this, you would need to move all the parsing logic into your JavaScript code, which might be ugly. Another method would be to compile to a format that is easier to interpret, e.g. bytecode, so that you can precompute everything you need in the parser (all label positions etc.).
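As a rough sketch of the bytecode idea: the instruction shapes (`jumpIf`, `do`, `label`) and the functions `link` and `run` below are invented for illustration, and I assume the compiler emits conditions and statements as JavaScript callbacks over an `env` object:

```javascript
// Resolve label names to instruction indices once, ahead of execution.
function link(program) {
  const labelIndex = new Map();
  const code = [];
  for (const ins of program) {
    if (ins.op === "label") labelIndex.set(ins.name, code.length);
    else code.push(ins);
  }
  return code.map((ins, i) => {
    if (ins.op !== "jumpIf") return ins;
    const target = labelIndex.get(ins.label);
    // Only forward jumps are allowed in the source language.
    if (target === undefined || target <= i) {
      throw new Error(`no forward label "${ins.label}"`);
    }
    return { ...ins, target };
  });
}

// A tiny interpreter loop: pc is the index of the next instruction.
function run(code, env) {
  let pc = 0;
  while (pc < code.length) {
    const ins = code[pc];
    if (ins.op === "jumpIf" && ins.cond(env)) {
      pc = ins.target;
    } else {
      if (ins.op === "do") ins.run(env);
      pc += 1;
    }
  }
  return env;
}

// The example from the question in this hypothetical instruction format.
const program = [
  { op: "jumpIf", cond: (env) => env.cond1, label: "a" },
  { op: "do", run: (env) => env.trace.push("something1") },
  { op: "jumpIf", cond: (env) => env.cond2, label: "b" },
  { op: "do", run: (env) => env.trace.push("something2") },
  { op: "label", name: "a" },
  { op: "do", run: (env) => env.trace.push("something3") },
  { op: "jumpIf", cond: (env) => env.cond3, label: "c" },
  { op: "do", run: (env) => env.trace.push("something4") },
  { op: "label", name: "c" },
  { op: "do", run: (env) => env.trace.push("something5") },
  { op: "label", name: "b" },
];
const result = run(link(program), { cond1: true, cond2: false, cond3: true, trace: [] });
console.log(result.trace);
```

With this layout the goto costs nothing more than an assignment to `pc`, and no flags or guards are needed.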
You could do something like tracking the goto state in a while loop, but it wouldn't look too pretty:
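For instance, a hand-written sketch for the question's example might look like this (the `pc` variable, the block names and the function name `runScript` are made up for illustration, and `trace.push` stands in for the "do something" lines):

```javascript
function runScript(cond1, cond2, cond3) {
  const trace = [];
  let pc = "start"; // name of the block to execute next
  while (pc !== "end") {
    switch (pc) {
      case "start":
        if (cond1) { pc = "a"; break; } // if cond1 goto a
        trace.push("something1");
        if (cond2) { pc = "b"; break; } // if cond2 goto b
        trace.push("something2");
        pc = "a"; // fall through to label a
        break;
      case "a":
        trace.push("something3");
        if (cond3) { pc = "c"; break; } // if cond3 goto c
        trace.push("something4");
        pc = "c";
        break;
      case "c":
        trace.push("something5");
        pc = "b";
        break;
      case "b":
        pc = "end"; // nothing follows label b
        break;
    }
  }
  return trace;
}

console.log(runScript(false, false, true));
```

Each `break` leaves the switch, and the surrounding while loop then dispatches on the new value of `pc`.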
One alternative solution would be to turn each label into a function containing the code from that label up to the following label, followed by a call to the function generated for the following label.
The pro is that a goto can be replaced by a simple function call. The drawback is that, for long scripts or loops, you may end up with rather large call stacks.
Using this method, a simple algorithm would be:
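As a hand-translated sketch of the question's example under this approach (the names `makeScript`, `start` and `label_a` etc. are hypothetical, and `trace.push` stands in for the "do something" lines): split the script at every label, emit one function per chunk, end each function with a call to the next one, and turn every goto into a `return` of the target function's call.

```javascript
function makeScript(cond1, cond2, cond3) {
  const trace = [];

  // Code before the first label.
  function start() {
    if (cond1) return label_a(); // if cond1 goto a
    trace.push("something1");
    if (cond2) return label_b(); // if cond2 goto b
    trace.push("something2");
    return label_a(); // continue with the following label
  }

  function label_a() {
    trace.push("something3");
    if (cond3) return label_c(); // if cond3 goto c
    trace.push("something4");
    return label_c();
  }

  function label_c() {
    trace.push("something5");
    return label_b();
  }

  function label_b() {
    return trace; // end of script
  }

  return start;
}

console.log(makeScript(true, false, false)());
```

Nesting the generated functions inside one outer function is one possible way to let them share variables; the `return` in front of each call matters, because without it execution would continue after the goto once the callee finishes.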
This may cause additional problems; for example, what about variable scope? But at least it is an alternative approach, which I hope gets your mind started along more tracks. ;)