I'm trying to manipulate ASTs in Rust. There will be lots of manipulations, and I want my trees to be immutable, so to save time all references will be Rc
s.
My tree nodes will look like this:
enum Condition {
Equals(Rc<Expression>, Rc<Expression>),
LessThan(Rc<Expression>, Rc<Expression>),
...
}
enum Expression {
Plus(Rc<Expression>, Rc<Expression>),
...
}
I want to replace a random node of a given type with another node of the same type. To do generic operations on trees I've made a trait:
trait AstNode {
fn children(&self) -> Vec<Rc<AstNode>>;
}
And all nodes implement this. This allows me to walk the tree without having to destructure each node type for every operation, by simply calling children()
.
I also want to clone a node while updating only one of its children, and leaving the other ones in place. Assume that I've been able to generate nodes of the right concrete type (and I'm happy for the program to panic if I'm wrong). I'll add the following method to the trait:
trait AstNode {
fn clone_with_children(&self, new_children: Vec<Rc<AstNode>>) -> Self
where Self: Sized;
}
My plan is to take the children returned by childen()
, replace one of them, and call clone_with_children()
to construct a node of the same enum variant but with one node replaced.
My problem is how to write clone_with_children()
.
I need to downcast Rc<AstNode>
to Rc<Expression>
(or what have you), while keeping the refcount inside the Rc
the same, but none of the downcasting libraries I've found seem to be able to do that.
Is what I want possible, or should I do it completely differently?
No, you can't downcast
Rc<Trait>
toRc<Concrete>
, because trait objects likeRc<Trait>
don't contain any information aout the concrete type the data belongs to.Here's an excerpt from the official documentation that applies to all trait objects (
&Trait
,Box<Trait>
,Rc<Trait>
):The
data
field points to the struct itself, and thevtable
field points to a collection of function pointers, one for each method of the trait. At runtime, that's all you have. And that's not sufficient to reconstruct the struct's type. (WithRc<Trait>
, the blockdata
points to also contains the strong and weak reference counts, but no additional type information.)But there are at least 3 other options.
First, you could add all the operations that you need to do on
Expression
s orCondition
s to the traitAstNode
, and implement them for each struct. This way you never need to call a method that isn't available on the trait object, because the trait contains all the methods you need.This also entails replacing most
Rc<Expression>
andRc<Condition>
members in the tree withRc<AstNode>
, since you can't downcastRc<AstNode>
(but see below aboutAny
):A variation on this might be writing methods on
AstNode
that take&self
and return references to various concrete types:Instead of downcasting
Rc<AstNode>
toRc<Condition>
, just store it as anAstNode
and call e.g.rc.as_condition().unwrap().method_on_condition()
, if you're confidentrc
is in fact anRc<Condition>
.Second, you could create another enum that unifies
Condition
andExpression
, and do away with trait objects entirely. This is what I have done in the AST of my own Scheme interpreter. With this solution, no downcasting is required because all the type information is present at compile time. (Also with this solution, you definitely have to replaceRc<Condition>
orRc<Expression>
if you need to get anRc<Node>
out of it.)A third option is to use
Any
, and either.downcast_ref()
orRc::downcast
(currently only on nightly) eachRc<Any>
into its concrete type as needed.A slight variation on that would be to add a method
fn as_any(&self) -> &Any { self }
toAstNode
, and then you can callExpression
methods (that take&self
) by writingnode.as_any().downcast_ref::<Expression>().method_on_expression()
. But there is currently no way to (safely) upcast anRc<Trait>
to anRc<Any>
, even though there is no real reason it couldn't work.Any
is, strictly speaking, the closest thing to an answer to your question. I don't recommend it because downcasting, or needing to downcast, is often an indication of a poor design. Even in languages with class inheritance, like Java, if you want to do the same kind of thing (store a bunch of nodes in anArrayList<Node>
, for example), you'd have to either make all needed operations available on the base class or somewhere enumerate all the subclasses that you might need to downcast to, which is a terrible anti-pattern. Anything you'd do here withAny
would be comparable in complexity to just changingAstNode
to an enum.tl;dr: You need to store each node of the AST as a type that (a) provides all the methods you might need to call and (b) unifies all the types you might need to put in one. Option 1 uses trait objects, while option 2 uses enums, but they're pretty similar in principle. A third option is to use
Any
to enable downcasting.Related Q&A for further reading: