I have a list of vectors created by running:
import hcluster
import numpy as np
from ete2 import Tree
vecs = [np.array(i) for i in document_list]
where document_list is a collection of web documents I am analysing. I then perform hierarchical clustering:
Z = hcluster.linkage(vecs, metric='cosine')
This generates an ndarray such as:
[[ 12. 19. 0. 1. ]
[ 15. 21. 0. 3. ]
[ 18. 22. 0. 4. ]
[ 3. 16. 0. 7. ]
[ 8. 23. 0. 6. ]
[ 5. 27. 0. 6. ]
[ 1. 28. 0. 7. ]
[ 0. 21. 0. 2. ]
[ 5. 29. 0.18350472 2. ]
[ 2. 10. 0.18350472 3. ]
[ 47. 30. 0.29289577 9. ]
[ 13. 28. 0.29289577 13. ]
[ 73. 32. 0.29289577 18. ]
[ 26. 12. 0.42264521 5. ]
[ 5. 33. 0.42264521 12. ]
[ 14. 35. 0.42264521 12. ]
[ 19. 35. 0.42264521 18. ]
[ 4. 20. 0.31174826 3. ]
[ 34. 21. 0.5 19. ]
[ 38. 29. 0.31174826 21. ]]
Is it possible to convert this ndarray into a Newick string that can be passed to the ete2 Tree() constructor, so that I can draw and manipulate the tree using the tools ete2 provides?
Does it even make sense to try this? If not, is there another way to generate a tree/dendrogram from the same data with ete2? (I realise that other packages, such as dendropy and hcluster itself, can draw dendrograms, but I would prefer to use ete2 all the same.)
Thanks!
I use the following approach for pretty much the same thing:
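As a minimal sketch of one way to do the conversion (not necessarily the exact code this answer referred to), the linkage matrix can be walked row by row to build a Newick string. It assumes the usual hcluster/SciPy layout, where row i is [left_id, right_id, distance, size] and creates a new cluster with id n + i for n original observations. The function name `linkage_to_newick` and the toy `Z` and `labels` below are made up for illustration.

```python
def linkage_to_newick(Z, labels):
    """Build a Newick string from a linkage matrix (hypothetical helper).

    Z      -- rows of [left_id, right_id, distance, size]; row i is assumed
              to create cluster id n + i, where n = len(labels).
    labels -- one name per original observation; names are assumed to
              contain no Newick special characters such as ( ) , : ;
    """
    n = len(labels)
    # Map cluster id -> (Newick fragment, height). Leaves sit at height 0.
    nodes = {i: (str(labels[i]), 0.0) for i in range(n)}
    for i, row in enumerate(Z):
        left, right, height = int(row[0]), int(row[1]), float(row[2])
        parts = []
        for child in (left, right):
            fragment, child_height = nodes.pop(child)
            # Branch length is the merge height minus the child's own height.
            parts.append("%s:%.4f" % (fragment, height - child_height))
        nodes[n + i] = ("(%s)" % ",".join(parts), height)
    # The last row of Z creates the root cluster.
    root_fragment, _ = nodes[n + len(Z) - 1]
    return root_fragment + ";"


# Toy data: three observations, two merges.
Z = [[0, 1, 0.2, 2],
     [2, 3, 0.5, 3]]
labels = ["a", "b", "c"]
newick = linkage_to_newick(Z, labels)
print(newick)
```

The resulting string can then be handed to ete2, e.g. `Tree(newick)`. Note that if the merge heights in Z are not monotonically increasing, this sketch will produce negative branch lengths.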
Update: