I've been learning f# in the previous days, writing a small project which, at last, works (with the help of SO, of course).
I'm trying to learn to be as idiomatic as possible, which basically means that I try to not mutate my data structures. This is costing me a lot of effort :-)
In my search for idiomatic functional programming, I have been trying to use as much as possible lists, tuples and record, rather than objects. But then "praticality beats purity" and so I'm rewriting my small project using objects this time.
I thought that you could give me some advice, surely my idea of "good functional programming design" is not yet very well defined.
For instance I have to modify the nodes of a tree, modifying at the same time the states at two different levels (L and L+1). I've been able to do that without mutating data, but I needed a lot of "inner" and "helper" functions, with accumulators and so on. The nice feeling of being able to clearly express the algorithm was lost for me, due to the need to modify my data structure in an involved way. This is extremely easy in imperative languages, for instance: just dereference the pointers to the relevant nodes, modify their state and iterate over.
Surely I've not designed properly my structure, and for this reason I'm now trying the OOP approach.
I've looked at SICP, at How to design programs and have found a thesis by C. Okasaki ("Purely functional data structures") but the examples on SICP and HTDP are similar to what I did, or maybe I'm not able to understand them fully. The thesis on the other hand is a bit too hard for me at the moment :-)
What do you think about this "tension" which I am experiencing? Am I interpreting the "never mutate data" too strictly? Could you suggest me some resource?
Thanks in advance,
Francesco
When it comes to 'tree update', I think you can always do it pretty elegantly
using catamorphisms (folds over trees). I have a long blog series about this,
and most of the example code below comes from part 4 of the series.
When first learning, I find it best to focus on a particular small, concrete
problem statement. Based on your description, I invented the following problem:
You have a binary tree, where each node contains a "name" and an "amount" (can
think of it like bank accounts or some such). And I want to write a function
which can tell someone to "steal" a certain amount from each of his direct
children. Here's a picture to describe what I mean:
alt text http://oljksa.bay.livefilestore.com/y1pNWjpCPP6MbI3rMfutskkTveCWVEns5xXaOf-NZlIz2Hs_CowykUmwtlVV7bPXRwh4WHJMT-5hSuGVZEhmAIPuw/FunWithTrees.png
On the left I have an original tree. The middle example shows the result I want if node
'D' is asked to steal '10' from each of his children. And the right example
shows what the desired result is if instead I asked 'F' to steal '30' in the original example.
Note that the tree structure I use will be immutable, and the red colors in the
diagram designate "new tree nodes" relative to the original tree. That is black
nodes are shared with the original tree structure (Object.ReferenceEquals to one
another).
Now, assuming a typical tree structure like
type Tree<'T> = //'
| Node of 'T * Tree<'T> * Tree<'T> //'
| Leaf
we'd represent the original tree as
let origTree = Node(("D",1000),
Node(("B",1000),
Node(("A",1000),Leaf,Leaf),
Node(("C",1000),Leaf,Leaf)),
Node(("F",1000),
Node(("E",1000),Leaf,Leaf),
Leaf))
and the "Steal" function is really easy to write, assuming you have the usual "fold"
boilerplate:
// have 'stealerName' take 'amount' from each of its children and
// add it to its own value
let Steal stealerName amount tree =
let Subtract amount = function
| Node((name,value),l,r) -> amount, Node((name,value-amount),l,r)
| Leaf -> 0, Leaf
tree |> XFoldTree
(fun (name,value) left right ->
if name = stealerName then
let leftAmt, newLeft = Subtract amount left
let rightAmt, newRight = Subtract amount right
XNode((name,value+leftAmt+rightAmt),newLeft,newRight)
else
XNode((name,value), left, right))
XLeaf
// examples
let dSteals10 = Steal "D" 10 origTree
let fSteals30 = Steal "F" 30 origTree
That's it, you're done, you've written an algorithm that "updates" levels L and
L+1 of an immutable tree just by authoring the core logic. Rather than explain
it all here, you should go read my blog series (at least the start: parts one two three four).
Here's all the code (that drew the picture above):
// Tree boilerplate
// See http://lorgonblog.spaces.live.com/blog/cns!701679AD17B6D310!248.entry
type Tree<'T> =
| Node of 'T * Tree<'T> * Tree<'T>
| Leaf
let (===) x y = obj.ReferenceEquals(x,y)
let XFoldTree nodeF leafV tree =
let rec Loop t cont =
match t with
| Node(x,left,right) -> Loop left (fun lacc ->
Loop right (fun racc ->
cont (nodeF x lacc racc t)))
| Leaf -> cont (leafV t)
Loop tree (fun x -> x)
let XNode (x,l,r) (Node(xo,lo,ro) as orig) =
if xo = x && lo === l && ro === r then
orig
else
Node(x,l,r)
let XLeaf (Leaf as orig) =
orig
let FoldTree nodeF leafV tree =
XFoldTree (fun x l r _ -> nodeF x l r) (fun _ -> leafV) tree
// /////////////////////////////////////////
// stuff specific to this problem
let origTree = Node(("D",1000),
Node(("B",1000),
Node(("A",1000),Leaf,Leaf),
Node(("C",1000),Leaf,Leaf)),
Node(("F",1000),
Node(("E",1000),Leaf,Leaf),
Leaf))
// have 'stealerName' take 'amount' from each of its children and
// add it to its own value
let Steal stealerName amount tree =
let Subtract amount = function
| Node((name,value),l,r) -> amount, Node((name,value-amount),l,r)
| Leaf -> 0, Leaf
tree |> XFoldTree
(fun (name,value) left right ->
if name = stealerName then
let leftAmt, newLeft = Subtract amount left
let rightAmt, newRight = Subtract amount right
XNode((name,value+leftAmt+rightAmt),newLeft,newRight)
else
XNode((name,value), left, right))
XLeaf
let dSteals10 = Steal "D" 10 origTree
let fSteals30 = Steal "F" 30 origTree
// /////////////////////////////////////////
// once again,
// see http://lorgonblog.spaces.live.com/blog/cns!701679AD17B6D310!248.entry
// DiffTree: Tree<'T> * Tree<'T> -> Tree<'T * bool>
// return second tree with extra bool
// the bool signifies whether the Node "ReferenceEquals" the first tree
let rec DiffTree(tree,tree2) =
XFoldTree (fun x l r t t2 ->
let (Node(x2,l2,r2)) = t2
Node((x2,t===t2), l l2, r r2)) (fun _ _ -> Leaf) tree tree2
open System.Windows
open System.Windows.Controls
open System.Windows.Input
open System.Windows.Media
open System.Windows.Shapes
// Handy functions to make multiple transforms be a more fluent interface
let IdentT() = new TransformGroup()
let AddT t (tg : TransformGroup) = tg.Children.Add(t); tg
let ScaleT x y (tg : TransformGroup) = tg.Children.Add(new ScaleTransform(x, y)); tg
let TranslateT x y (tg : TransformGroup) = tg.Children.Add(new TranslateTransform(x, y)); tg
// Draw: Canvas -> Tree<'T * bool> -> unit
let Draw (canvas : Canvas) tree =
// assumes canvas is normalized to 1.0 x 1.0
FoldTree (fun ((name,value),b) l r trans ->
// current node in top half, centered left-to-right
let tb = new TextBox(Width=100.0, Height=100.0, FontSize=30.0, Text=sprintf "%s:%d" name value,
// the tree is a "diff tree" where the bool represents
// "ReferenceEquals" differences, so color diffs Red
Foreground=(if b then Brushes.Black else Brushes.Red),
HorizontalContentAlignment=HorizontalAlignment.Center,
VerticalContentAlignment=VerticalAlignment.Center)
tb.RenderTransform <- IdentT() |> ScaleT 0.005 0.005 |> TranslateT 0.25 0.0 |> AddT trans
canvas.Children.Add(tb) |> ignore
// left child in bottom-left quadrant
l (IdentT() |> ScaleT 0.5 0.5 |> TranslateT 0.0 0.5 |> AddT trans)
// right child in bottom-right quadrant
r (IdentT() |> ScaleT 0.5 0.5 |> TranslateT 0.5 0.5 |> AddT trans)
) (fun _ -> ()) tree (IdentT())
let TreeToCanvas tree =
let canvas = new Canvas(Width=1.0, Height=1.0, Background = Brushes.Blue,
LayoutTransform=new ScaleTransform(400.0, 400.0))
Draw canvas tree
canvas
let TitledControl title control =
let grid = new Grid()
grid.ColumnDefinitions.Add(new ColumnDefinition())
grid.RowDefinitions.Add(new RowDefinition())
grid.RowDefinitions.Add(new RowDefinition())
let text = new TextBlock(Text = title, HorizontalAlignment = HorizontalAlignment.Center)
Grid.SetRow(text, 0)
Grid.SetColumn(text, 0)
grid.Children.Add(text) |> ignore
Grid.SetRow(control, 1)
Grid.SetColumn(control, 0)
grid.Children.Add(control) |> ignore
grid
let HorizontalGrid (controls:_[]) =
let grid = new Grid()
grid.RowDefinitions.Add(new RowDefinition())
for i in 0..controls.Length-1 do
let c = controls.[i]
grid.ColumnDefinitions.Add(new ColumnDefinition())
Grid.SetRow(c, 0)
Grid.SetColumn(c, i)
grid.Children.Add(c) |> ignore
grid
type MyWPFWindow(content, title) as this =
inherit Window()
do
this.Content <- content
this.Title <- title
this.SizeToContent <- SizeToContent.WidthAndHeight
[<System.STAThread()>]
do
let app = new Application()
let controls = [|
TitledControl "Original" (TreeToCanvas(DiffTree(origTree,origTree)))
TitledControl "D steals 10" (TreeToCanvas(DiffTree(origTree,dSteals10)))
TitledControl "F steals 30" (TreeToCanvas(DiffTree(origTree,fSteals30))) |]
app.Run(new MyWPFWindow(HorizontalGrid controls, "Fun with trees")) |> ignore
I guess if you start your sentence with " I have to modify the nodes of a tree, modifying at the same time the states at two different levels" then you're not really tackling your problem in a functional way. It's like writing a paper in a foreign language by first writing it in your mother tongue, then trying to translate. Doesn't really work. I know it hurts, but in my opinion it's better to immerse yourself completely. Don't worry about comparing the approaches just yet.
One way I've found to learn "the functional way" is to look at (and implement yourself!) some functional pearls. They're basically well document uber-functional elegant progams to solve a variety of problems. Start with the older ones, and don't be afraid to stop reading and try another one if you don't get it. Just come back to it later with renewed enthousiasm and more experience. It helps :)
What do you think about this "tension"
which I am experiencing? Am I
interpreting the "never mutate data"
too strictly? Could you suggest me
some resource?
In my opinion, if you're learning functional programming for the first time, its best to start out with zero mutable state. Otherwise, you'll only end up falling back on mutable state as your first resort, and all of your F# code will be C# with a little different syntax.
Regarding data structures, some are easier to express in a functional style than others. Could you provide a description of how you're trying to modify your tree?
For now, I would recommend F# Wikibook's page on data structures to see how data structures are written in a functional style.
I've looked at SICP, at How to design
programs and have found a thesis by C.
Okasaki ("Purely functional data
structures")
I personally found Okasaki's book more readable than the thesis online.
I have to modify nodes of a tree.
No you don't. That's your problem right there.
This is costing me a lot of effort
This is typical. It's not easy to learn to program with immutable data structures. And to most beginners, it seems unnatural at first. It's doubly difficult because HTDP and SICP don't give you good models to follow (see footnote).
I thought that you could give me some advice, surely my idea of "good functional programming design" is not yet very well defined.
We can, but you have to tell us what the problem is. Then many people on this forum can tell you if it is the sort of problem whose solution can be expressed in a clear way without resorting to mutation. Most tree problems can. But with the information you've given us, there's no way for us to tell.
Am I interpreting the "never mutate data" too strictly?
Not strictly enough, I'd say.
Please post a question indicating what problem you're trying to solve.
Footnote: both HTDP and SICP are done in Scheme, which lacks pattern matching. In this setting it is much harder to understand tree-manipulation code than it is using the pattern matching provided by F#. As far as I'm concerned, pattern matching is an essential feature for writing clear code in a purely functional style. For resources, you might consider Graham Hutton's new book on Programming in Haskell.
For instance I have to modify the nodes of a tree, modifying at the same time the states at two different levels (L and L+1)
Why? In a functional language, you'd create a new tree instead. It can reuse the subtrees which don't need to be modified, and just plug them into a newly created root. "Don't mutate data" doesn't mean "try to mutate data without anyone noticing, and by adding so many helper methods that no one realize that this is what you're doing".
It means "don't mutate your data. Create new copies instead, which are initialized with the new, correct, values".