In Matlab you can specify the maximum number of leaf nodes to plot as an argument to the dendrogram function: dendrogram(tree,P) generates a dendrogram plot with no more than P leaf nodes.
My attempts to do the same with heatmap.2 in R have failed miserably. Posts on Stack Overflow and Biostars suggest using cutree, but heatmap.2 fails when I pass the result to its Rowv option as those posts suggest. Here "TAD" is the data matrix, 8 columns by 831 rows.
# cluster it
hr <- hclust(dist(TAD, method="manhattan"), method="average")
# draw the heat map
heatmap.2(TAD, main="Hierarchical Cluster",
Rowv=as.dendrogram(cutree(hr, k=5)),
Colv=NA, dendrogram="row", col=my_palette, density.info="none", trace="none")
returns the message:
Error in UseMethod("as.dendrogram") :
no applicable method for 'as.dendrogram' applied to an object of class "c('integer', 'numeric')"
Is using cutree the correct avenue to explore for plotting a restricted dendrogram? Is there an easier way to do this, akin to Matlab?
Just to clarify and provide some data: I do not want to drop any of the rows. Instead of plotting and interpreting 831 branches, I would like to interpret 3 branches. So I would like the row dendrogram to be constrained to 3 branches (cut at height 150), and all 831 rows of the corresponding heatmap to be grouped under those 3 upper branches of the original dendrogram.
Setting the heatmap aside for the moment, the distance matrix and hierarchical clustering are computed on the numeric matrix x.
A plot of the resulting dendrogram shows all ten branches,
but cutting at a height of 150 restricts it to only 3 branches.
The dendrogram plotted with plot(rowDend) is what I would like to see as the row dendrogram of the following heatmap.
But I cannot find any way to restrict the row dendrogram in the heatmap to the desired number of interpretable branches. Plotting all 831 branches is extremely messy.
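The steps described above (distance matrix, clustering, cutting the tree at a height) can be sketched in base R roughly as follows. This is a minimal sketch under stated assumptions, not the original code: the matrix x is random stand-in data, and the cut height h is computed from the merge heights rather than fixed at 150, because 150 only makes sense for the original data. The name rowDend is taken from the text above.

```r
# Hypothetical stand-in for the real data: 30 rows in three loose groups
set.seed(1)
x <- rbind(matrix(rnorm(80, mean = 0),  ncol = 8),
           matrix(rnorm(80, mean = 10), ncol = 8),
           matrix(rnorm(80, mean = 20), ncol = 8))
rownames(x) <- paste0("row", seq_len(nrow(x)))

# Distance matrix and hierarchical clustering, as in the question
hc   <- hclust(dist(x, method = "manhattan"), method = "average")
dend <- as.dendrogram(hc)

# Choose a height above the merge that leaves three clusters and
# below the merge that leaves two (assumption: replaces the fixed 150)
n <- nrow(x)
h <- mean(hc$height[c(n - 3, n - 2)])

# cut() on a dendrogram returns the top of the tree and the cut-off subtrees
parts   <- cut(dend, h = h)
rowDend <- parts$upper      # leaves are labelled "Branch 1", "Branch 2", ...

plot(rowDend)               # only the three upper branches are drawn
```

Note that rowDend has only three leaves, so it cannot be passed as Rowv for a heatmap of the full matrix, which expects one leaf per row.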
The question is what you mean when you write "selecting number of leaf nodes".
The Rowv parameter in heatmap.2 needs a dendrogram or a TRUE/FALSE value (see the Rowv entry in the heatmap.2 help file).
So, when using cutree(hr, k=5), you get a vector of integers (telling you which cluster each item belongs to, in a cut that produces 5 clusters). Using as.dendrogram on it will not produce a dendrogram, hence Rowv=as.dendrogram(cutree(hr, k=5)) throws an error.
If you want to highlight some of the branches in your tree, I invite you to look into the dendextend package to see which solution works best for you. Here is an example that may be what you are asking for:
With the following output:
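To make the distinction between cluster labels and a tree concrete, here is a minimal sketch in base R only. It is an illustration under assumptions, not the dendextend example referred to above: the matrix x is random stand-in data, the colours are arbitrary, and base heatmap() is used so that no extra package is needed. heatmap.2 also accepts Rowv and RowSideColors arguments, so the same idea should carry over.

```r
# Hypothetical stand-in data: 30 rows in three loose groups
set.seed(1)
x <- rbind(matrix(rnorm(80, mean = 0),  ncol = 8),
           matrix(rnorm(80, mean = 10), ncol = 8),
           matrix(rnorm(80, mean = 20), ncol = 8))
rownames(x) <- paste0("row", seq_len(nrow(x)))

hr <- hclust(dist(x, method = "manhattan"), method = "average")

groups   <- cutree(hr, k = 3)     # integer vector: one cluster label per row
row_dend <- as.dendrogram(hr)     # the tree itself; this is what Rowv accepts

# Map each row's cluster label to a colour (arbitrary choice of colours)
side_cols <- c("tomato", "steelblue", "seagreen")[groups]

# Full dendrogram for the rows, with the 3-cluster membership as a side bar
heatmap(x, Rowv = row_dend, Colv = NA, scale = "none",
        RowSideColors = side_cols)
```

The dendrogram still shows every row, but the side bar makes the three top-level branches easy to read.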
Consider also looking at the recently published dendextend tutorial. You may want to work with the branches_attr_by_labels function (in the tutorial it is under the section "Adjusting branches based on labels"), which lets you manipulate dendrograms to create plots such as this:
If what you want is to remove nodes and leave only a few of them to be plotted, you should probably just create the heatmap for a subset of the data. You can also look at the prune function in dendextend (for the general purpose of looking at smaller dendrograms), but if you want to use it for a heatmap, it is better to just work with the relevant subset of your data.
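The subset approach might look like the following minimal sketch, again in base R and on random stand-in data. The choice of cluster 1 as the subset is arbitrary and only for illustration.

```r
# Hypothetical stand-in data: 30 rows in three loose groups
set.seed(1)
x <- rbind(matrix(rnorm(80, mean = 0),  ncol = 8),
           matrix(rnorm(80, mean = 10), ncol = 8),
           matrix(rnorm(80, mean = 20), ncol = 8))
rownames(x) <- paste0("row", seq_len(nrow(x)))

hr     <- hclust(dist(x, method = "manhattan"), method = "average")
groups <- cutree(hr, k = 3)

# Keep only the rows of one cluster (cluster 1 chosen arbitrarily)
x_sub <- x[groups == 1, , drop = FALSE]

# Re-cluster the subset so the dendrogram has one leaf per remaining row
hr_sub   <- hclust(dist(x_sub, method = "manhattan"), method = "average")
dend_sub <- as.dendrogram(hr_sub)

heatmap(x_sub, Rowv = dend_sub, Colv = NA, scale = "none")
```

Re-clustering the subset keeps the dendrogram and the heatmap rows in agreement, which is the reason for subsetting the data rather than pruning the tree.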