thanks for taking the time to read this (sorry for the wall of text).
Background
Yes, this is a school project. No, I'm not asking you to do this for me. I've done all the work myself I've just been stumped by Java itself, not so much the problem.
The project is about the basics of genetic programming. I randomly generate trees (which represent basic mathematical expressions), evaluate their fitness against data from the target function (x, y values), then manipulate the set by eliminating some that are really far off, mutating some ever so slightly, then performing a "crossover" operation on the two most fit expressions. The crossover is the issue. I have created an algorithm that works but it permanently changes the two original trees, which I want to keep around in case the crossover generates two trees that are way off.
To fix this issue, I looked into copying the tree structure, which turned out to be fairly difficult to achieve. I looked into a couple different methods, firstly serialization. After trying it out and reading a few examples I tried it myself and failed. I realized I don't have a good enough grasp of Java to use that technique. So I decided to go with a recursive copy constructor on the Node class, and have a simple one on the GPTree class (see below).
I can't explain what's going wrong, and it's especially hard if you can't see it, but bear with me. After spending a while debugging in Eclipse, the issue seems to start in the GPTree.crossover
method. Specifically the following statements:
Node t1Node = getRandomNode(t1.root);
Node t1NodeParent = t1Node.parent;
Node t2Node = getRandomNode(t2.root);
Node t2NodeParent = t2Node.parent;
When I look at the debugger information for the first two nodes (t1Node
and t1NodeParent
) for example, t1Node
's id will be (all numbers made up for the example) 18 and its parent node's id will be 22. Then, after the assignment to t1NodeParent
, t1NodeParent
's id will be 22 (like it's supposed to be), but neither of its left or right children will have the id of 18 (like the child's)! Because of this, the if
clauses below these assignments get messed up and then the trees just get messed up, too. Sometimes (not every time, oddly enough) the original trees will be changed as well.
I assume this has something to do with how I tried to copy the trees, since that's the only thing that has changed. But, for the life of me, I could not tell you why it's doing that.
So, thank you for reading this long-winded question . . . I hope you can help. :/
Here is my severely stripped down code that demonstrates my problem:
GPEnv.java (gist)
import java.util.ArrayList;
public class GPEnv {
private final int MAX_HEIGHT = 4;
private ArrayList<GPTree> trees;
public GPEnv(int numTrees) {
trees = new ArrayList<GPTree>();
for (int i = 0; i < numTrees; i++) {
trees.add(new GPTree(MAX_HEIGHT));
}
}
public void crossover() {
System.out.println(trees.get(0));
System.out.println(trees.get(1));
System.out.println();
/*
This commented version works, but it permanently alters the original trees.
GPTree[] res = GPTree.crossover(
trees.get(0),
trees.get(1)
);
*/
GPTree[] res = GPTree.crossover(
trees.get(0).copy(),
trees.get(1).copy()
);
System.out.println(res[0]);
System.out.println(res[1]);
System.out.println();
System.out.println(trees.get(0));
System.out.println(trees.get(1));
}
public static void main(String[] args) {
GPEnv env = new GPEnv(2);
env.crossover();
}
}
Expression.java (gist)
public abstract class Expression {
protected static enum Token {
PLUS("+"),
MINUS("-"),
TIMES("*"),
DIVIDE("/"),
NUMBER(""),
VARIABLE("x");
private final String symbol;
Token(String symbol) {
this.symbol = symbol;
}
public String toString() {
return this.symbol;
}
}
protected static class Node {
protected Token token;
protected Node parent, left, right;
protected double value;
public Node() {
this.parent = null;
this.left = this.right = null;
}
public Node(int number) {
this();
this.token = Token.NUMBER;
this.value = number;
}
public Node(Token token) {
this();
this.token = token;
this.value = Double.NaN;
}
private Node(Token token, double number, Node parent, Node left, Node right) {
switch (token) {
case PLUS:
this.token = Token.PLUS;
this.value = Double.NaN;
break;
case MINUS:
this.token = Token.MINUS;
this.value = Double.NaN;
break;
case TIMES:
this.token = Token.TIMES;
this.value = Double.NaN;
break;
case DIVIDE:
this.token = Token.DIVIDE;
this.value = Double.NaN;
break;
case NUMBER:
this.token = Token.NUMBER;
this.value = Double.parseDouble(number + "");
break;
case VARIABLE:
this.token = Token.VARIABLE;
this.value = Double.NaN;
break;
}
this.parent = parent;
this.left = left;
this.right = right;
}
public void setParent(Node rent) {
this.parent = rent;
}
public boolean isOperator() {
switch (this.token) {
case PLUS:
case MINUS:
case TIMES: // intentional fall-throughs
case DIVIDE:
return true;
default:
return false;
}
}
public String toString() {
if (this.token == Token.NUMBER) {
return this.value + "";
}
return this.token.toString();
}
public Node copy() {
Node left = null;
Node right = null;
if (this.left != null) {
left = this.left.copy();
}
if (this.right != null) {
right = this.right.copy();
}
return new Node(token, value, parent, left, right);
}
}
protected Node root;
private void postOrderTraverse(Node node, StringBuilder sb) {
if (node != null) {
postOrderTraverse(node.left, sb);
postOrderTraverse(node.right, sb);
sb.append(node + " ");
}
}
@Override
public String toString() {
StringBuilder sb = new StringBuilder();
postOrderTraverse(this.root, sb);
return sb.toString();
}
}
GPTree.java (gist)
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Random;
public class GPTree extends Expression {
private static final int MIN_HEIGHT = 2;
private static final int MAX_MUTATED = 4;
private static final Random rand = new Random();
private int maxHeight;
public GPTree(int maxHeight) {
super();
this.maxHeight = maxHeight;
generateRandomTree();
}
private GPTree(Node root, int maxHeight) {
this.root = root;
this.maxHeight = maxHeight;
}
private static Node getRandomNode(Node node) {
if (node.left == null && node.right == null) {
return node;
} else if (rand.nextInt(10) > 6) {
return node;
}
if (rand.nextInt(2) == 0) {
return getRandomNode(node.left);
} else {
return getRandomNode(node.right);
}
}
private void generateRandomTree(Node node, int depth) {
if (depth == maxHeight) {
node.left = new Node(rand.nextInt(10));
node.left.setParent(node);
node.right = new Node(rand.nextInt(10));
node.right.setParent(node);
return;
}
if (rand.nextInt(2) == 0) {
node.left = new Node(Token.values()[rand.nextInt(4)]);
generateRandomTree(node.left, depth + 1);
} else {
// give numbers an increased chance of occuring (60%)
if (rand.nextInt(10) > 3) {
node.left = new Node(rand.nextInt(10));
} else {
node.left = new Node(Token.VARIABLE);
}
}
if (rand.nextInt(2) == 0) {
node.right = new Node(Token.values()[rand.nextInt(4)]);
generateRandomTree(node.right, depth + 1);
} else {
// give numbers an increased chance of occuring (60%)
if (rand.nextInt(10) > 3) {
node.right = new Node(rand.nextInt(10));
} else {
node.right = new Node(Token.VARIABLE);
}
}
if (depth < MIN_HEIGHT && (!node.left.isOperator() && !node.right.isOperator())) {
if (rand.nextInt(2) == 0) {
node.left = new Node(Token.values()[rand.nextInt(4)]);
generateRandomTree(node.left, depth + 1);
} else {
node.right = new Node(Token.values()[rand.nextInt(4)]);
generateRandomTree(node.right, depth + 1);
}
}
node.left.setParent(node);
node.right.setParent(node);
}
public void generateRandomTree() {
if (this.root == null) {
this.root = new Node(Token.values()[rand.nextInt(4)]);
}
generateRandomTree(this.root, 1);
}
public static GPTree[] crossover(GPTree t1, GPTree t2) {
GPTree[] result = new GPTree[2];
Node t1Node = getRandomNode(t1.root);
Node t1NodeParent = t1Node.parent;
Node t2Node = getRandomNode(t2.root);
Node t2NodeParent = t2Node.parent;
t2Node.parent = t1NodeParent;
if (t1NodeParent == null) {
t1.root = t2Node;
} else {
if (t1NodeParent.left == t1Node) {
t1NodeParent.left = t2Node;
} else {
t1NodeParent.right = t2Node;
}
}
t1Node.parent = t2NodeParent;
if (t2NodeParent == null) {
t2.root = t1Node;
} else {
if (t2NodeParent.left == t2Node) {
t2NodeParent.left = t1Node;
} else {
t2NodeParent.right = t1Node;
}
}
result[0] = t1;
result[1] = t2;
return result;
}
public GPTree copy() {
return new GPTree(root.copy(), maxHeight);
}
}
It appears that you are not attempting to fix up the parent links when you make the copy.
Imagine a basic expression 1+2. When you copy the '+', say id 22, you create a new '+' node with copies of its two children and a reference to its original parent (null). Say the original '1' node has id 5, and the newly-created '1' node has id 18. You preserved the parent link back to 22 when you built your new '1' node, because when making the copy you did not yet have access to the new parent.
A way out of this problem when deep-copying circular data structures is to provide a special copy operation that takes the new parent as a parameter. The deep copy should call this special copy with the incomplete new parent node for each of its children.
Something like
Node.copy()
method - you could actually name itName.clone()
and then make whole class implementCloneable
. What is more: you are cloning and setting children but you forget that those children would have new parent. After cloning you should set parent to the new ones, e.g.:setParent(this)
method.Without it, you are just creating new nodes that are using your cloned nodes' parent as their parent, which might be the very reason things go wrong.